Dataset statistics
| Number of variables | 118 |
|---|---|
| Number of observations | 28494 |
| Missing cells | 2902201 |
| Missing cells (%) | 86.3% |
| Total size in memory | 120.6 MiB |
| Average record size in memory | 4.3 KiB |
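The missing-cell percentage and the average record size can be re-derived from the other figures in this table. The sketch below is a minimal check of that arithmetic. It assumes the percentage is taken over every cell (observations times variables) and that 1 MiB equals 1024 KiB. The variable names are illustrative and are not taken from the profiling tool.

```python
# Figures copied from the dataset statistics table above.
n_variables = 118
n_observations = 28494
missing_cells = 2902201
total_size_mib = 120.6

# Assumption: the percentage is computed over the full grid of cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Under those assumptions both derived values agree with the reported 86.3% and 4.3 KiB.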
Variable types
| Categorical | 91 |
|---|---|
| Numeric | 5 |
| Unsupported | 12 |
| URL | 10 |
deviceCheckVoice has constant value "" | Constant |
_id has a high cardinality: 28494 distinct values | High cardinality |
application_id has a high cardinality: 3878 distinct values | High cardinality |
createdAt has a high cardinality: 28494 distinct values | High cardinality |
trial has a high cardinality: 2810 distinct values | High cardinality |
user has a high cardinality: 3835 distinct values | High cardinality |
voice_1.fileName has a high cardinality: 3139 distinct values | High cardinality |
voice_10.fileName has a high cardinality: 3277 distinct values | High cardinality |
voice_2.fileName has a high cardinality: 2923 distinct values | High cardinality |
voice_3.fileName has a high cardinality: 3061 distinct values | High cardinality |
voice_4.fileName has a high cardinality: 3139 distinct values | High cardinality |
voice_5.fileName has a high cardinality: 3061 distinct values | High cardinality |
voice_6.fileName has a high cardinality: 3061 distinct values | High cardinality |
voice_7.fileName has a high cardinality: 3277 distinct values | High cardinality |
voice_8.fileName has a high cardinality: 3139 distinct values | High cardinality |
voice_9.fileName has a high cardinality: 3277 distinct values | High cardinality |
englishGrammar_12 is highly imbalanced (80.5%) | Imbalance |
englishGrammar_13 is highly imbalanced (70.7%) | Imbalance |
englishGrammar_14 is highly imbalanced (60.9%) | Imbalance |
englishGrammar_15 is highly imbalanced (75.7%) | Imbalance |
englishGrammar_16 is highly imbalanced (82.2%) | Imbalance |
englishGrammar_17 is highly imbalanced (50.4%) | Imbalance |
englishGrammar_18 is highly imbalanced (83.4%) | Imbalance |
englishGrammar_2 is highly imbalanced (69.0%) | Imbalance |
englishGrammar_21 is highly imbalanced (52.2%) | Imbalance |
englishGrammar_24 is highly imbalanced (73.5%) | Imbalance |
englishGrammar_25 is highly imbalanced (92.5%) | Imbalance |
englishGrammar_26 is highly imbalanced (90.1%) | Imbalance |
englishGrammar_27 is highly imbalanced (51.9%) | Imbalance |
englishGrammar_3 is highly imbalanced (70.0%) | Imbalance |
englishGrammar_4 is highly imbalanced (63.5%) | Imbalance |
englishGrammar_6 is highly imbalanced (53.2%) | Imbalance |
englishGrammar_8 is highly imbalanced (61.7%) | Imbalance |
listening_1 is highly imbalanced (57.2%) | Imbalance |
listening_3 is highly imbalanced (67.0%) | Imbalance |
listening_4 is highly imbalanced (54.3%) | Imbalance |
listening_9 is highly imbalanced (57.6%) | Imbalance |
readingComprehension_2 is highly imbalanced (72.4%) | Imbalance |
readingComprehension_3 is highly imbalanced (70.4%) | Imbalance |
readingComprehension_5 is highly imbalanced (64.2%) | Imbalance |
situationalJudgement_1 is highly imbalanced (71.7%) | Imbalance |
situationalJudgement_10 is highly imbalanced (75.7%) | Imbalance |
situationalJudgement_11 is highly imbalanced (75.2%) | Imbalance |
situationalJudgement_13 is highly imbalanced (92.2%) | Imbalance |
situationalJudgement_14 is highly imbalanced (92.1%) | Imbalance |
situationalJudgement_15 is highly imbalanced (92.2%) | Imbalance |
situationalJudgement_2 is highly imbalanced (82.9%) | Imbalance |
situationalJudgement_4 is highly imbalanced (86.0%) | Imbalance |
situationalJudgement_5 is highly imbalanced (55.7%) | Imbalance |
situationalJudgement_7 is highly imbalanced (73.1%) | Imbalance |
situationalJudgement_9 is highly imbalanced (80.8%) | Imbalance |
voice_1.prompt is highly imbalanced (58.4%) | Imbalance |
voice_10.prompt is highly imbalanced (93.7%) | Imbalance |
voice_2.prompt is highly imbalanced (89.8%) | Imbalance |
voice_3.prompt is highly imbalanced (57.6%) | Imbalance |
voice_4.prompt is highly imbalanced (53.6%) | Imbalance |
voice_6.prompt is highly imbalanced (56.5%) | Imbalance |
voice_7.prompt is highly imbalanced (59.9%) | Imbalance |
voice_8.prompt is highly imbalanced (58.5%) | Imbalance |
voice_9.prompt is highly imbalanced (96.5%) | Imbalance |
automaticScore has 25212 (88.5%) missing values | Missing |
deviceCheckVoice has 25220 (88.5%) missing values | Missing |
englishGrammar_1 has 26183 (91.9%) missing values | Missing |
englishGrammar_10 has 26998 (94.7%) missing values | Missing |
englishGrammar_11 has 27004 (94.8%) missing values | Missing |
englishGrammar_12 has 27021 (94.8%) missing values | Missing |
englishGrammar_13 has 26986 (94.7%) missing values | Missing |
englishGrammar_14 has 27052 (94.9%) missing values | Missing |
englishGrammar_15 has 26157 (91.8%) missing values | Missing |
englishGrammar_16 has 26115 (91.7%) missing values | Missing |
englishGrammar_17 has 26156 (91.8%) missing values | Missing |
englishGrammar_18 has 26130 (91.7%) missing values | Missing |
englishGrammar_19 has 26585 (93.3%) missing values | Missing |
englishGrammar_2 has 26136 (91.7%) missing values | Missing |
englishGrammar_20 has 26581 (93.3%) missing values | Missing |
englishGrammar_21 has 26595 (93.3%) missing values | Missing |
englishGrammar_22 has 26624 (93.4%) missing values | Missing |
englishGrammar_23 has 27028 (94.9%) missing values | Missing |
englishGrammar_24 has 26993 (94.7%) missing values | Missing |
englishGrammar_25 has 27031 (94.9%) missing values | Missing |
englishGrammar_26 has 27060 (95.0%) missing values | Missing |
englishGrammar_27 has 27097 (95.1%) missing values | Missing |
englishGrammar_28 has 27049 (94.9%) missing values | Missing |
englishGrammar_3 has 26059 (91.5%) missing values | Missing |
englishGrammar_4 has 26084 (91.5%) missing values | Missing |
englishGrammar_5 has 26666 (93.6%) missing values | Missing |
englishGrammar_6 has 26676 (93.6%) missing values | Missing |
englishGrammar_7 has 26646 (93.5%) missing values | Missing |
englishGrammar_8 has 26600 (93.4%) missing values | Missing |
englishGrammar_9 has 27001 (94.8%) missing values | Missing |
final.accuracy has 25449 (89.3%) missing values | Missing |
final.wpm has 25449 (89.3%) missing values | Missing |
listening_1 has 25236 (88.6%) missing values | Missing |
listening_10 has 25222 (88.5%) missing values | Missing |
listening_2 has 25222 (88.5%) missing values | Missing |
listening_3 has 25240 (88.6%) missing values | Missing |
listening_4 has 25219 (88.5%) missing values | Missing |
listening_5 has 25238 (88.6%) missing values | Missing |
listening_6 has 25215 (88.5%) missing values | Missing |
listening_7 has 25210 (88.5%) missing values | Missing |
listening_8 has 25225 (88.5%) missing values | Missing |
listening_9 has 25213 (88.5%) missing values | Missing |
percent has 11232 (39.4%) missing values | Missing |
readingComprehension_1 has 25001 (87.7%) missing values | Missing |
readingComprehension_10 has 25140 (88.2%) missing values | Missing |
readingComprehension_11 has 25377 (89.1%) missing values | Missing |
readingComprehension_2 has 25023 (87.8%) missing values | Missing |
readingComprehension_3 has 25000 (87.7%) missing values | Missing |
readingComprehension_4 has 25007 (87.8%) missing values | Missing |
readingComprehension_5 has 25012 (87.8%) missing values | Missing |
readingComprehension_6 has 25018 (87.8%) missing values | Missing |
readingComprehension_7 has 25000 (87.7%) missing values | Missing |
readingComprehension_8 has 25030 (87.8%) missing values | Missing |
readingComprehension_9 has 25014 (87.8%) missing values | Missing |
score has 11229 (39.4%) missing values | Missing |
scoreBreakdown.pickIncorrect has 28494 (100.0%) missing values | Missing |
scoreBreakdown.tenses has 28494 (100.0%) missing values | Missing |
scoreBreakdown.wordSelection has 28494 (100.0%) missing values | Missing |
situationalJudgement_1 has 25837 (90.7%) missing values | Missing |
situationalJudgement_10 has 26378 (92.6%) missing values | Missing |
situationalJudgement_11 has 25957 (91.1%) missing values | Missing |
situationalJudgement_12 has 26433 (92.8%) missing values | Missing |
situationalJudgement_13 has 25968 (91.1%) missing values | Missing |
situationalJudgement_14 has 25826 (90.6%) missing values | Missing |
situationalJudgement_15 has 26387 (92.6%) missing values | Missing |
situationalJudgement_2 has 26363 (92.5%) missing values | Missing |
situationalJudgement_3 has 26420 (92.7%) missing values | Missing |
situationalJudgement_4 has 26402 (92.7%) missing values | Missing |
situationalJudgement_5 has 26382 (92.6%) missing values | Missing |
situationalJudgement_6 has 26395 (92.6%) missing values | Missing |
situationalJudgement_7 has 26382 (92.6%) missing values | Missing |
situationalJudgement_8 has 26406 (92.7%) missing values | Missing |
situationalJudgement_9 has 25820 (90.6%) missing values | Missing |
total has 11229 (39.4%) missing values | Missing |
trial has 25415 (89.2%) missing values | Missing |
voice_1.GCSData has 28494 (100.0%) missing values | Missing |
voice_1.audioUrl has 25369 (89.0%) missing values | Missing |
voice_1.fileName has 25350 (89.0%) missing values | Missing |
voice_1.prompt has 25350 (89.0%) missing values | Missing |
voice_10.audioUrl has 25237 (88.6%) missing values | Missing |
voice_10.fileName has 25212 (88.5%) missing values | Missing |
voice_10.prompt has 25212 (88.5%) missing values | Missing |
voice_2.GCSData has 28494 (100.0%) missing values | Missing |
voice_2.audioUrl has 25572 (89.7%) missing values | Missing |
voice_2.fileName has 25567 (89.7%) missing values | Missing |
voice_2.prompt has 25567 (89.7%) missing values | Missing |
voice_3.GCSData has 28494 (100.0%) missing values | Missing |
voice_3.audioUrl has 25438 (89.3%) missing values | Missing |
voice_3.fileName has 25429 (89.2%) missing values | Missing |
voice_3.prompt has 25429 (89.2%) missing values | Missing |
voice_4.GCSData has 28494 (100.0%) missing values | Missing |
voice_4.audioUrl has 25369 (89.0%) missing values | Missing |
voice_4.fileName has 25350 (89.0%) missing values | Missing |
voice_4.prompt has 25350 (89.0%) missing values | Missing |
voice_5.GCSData has 28494 (100.0%) missing values | Missing |
voice_5.audioUrl has 25439 (89.3%) missing values | Missing |
voice_5.fileName has 25429 (89.2%) missing values | Missing |
voice_5.prompt has 25429 (89.2%) missing values | Missing |
voice_6.GCSData has 28494 (100.0%) missing values | Missing |
voice_6.audioUrl has 25438 (89.3%) missing values | Missing |
voice_6.fileName has 25429 (89.2%) missing values | Missing |
voice_6.prompt has 25429 (89.2%) missing values | Missing |
voice_7.GCSData has 28494 (100.0%) missing values | Missing |
voice_7.audioUrl has 25237 (88.6%) missing values | Missing |
voice_7.fileName has 25212 (88.5%) missing values | Missing |
voice_7.prompt has 25212 (88.5%) missing values | Missing |
voice_8.GCSData has 28494 (100.0%) missing values | Missing |
voice_8.audioUrl has 25369 (89.0%) missing values | Missing |
voice_8.fileName has 25350 (89.0%) missing values | Missing |
voice_8.prompt has 25350 (89.0%) missing values | Missing |
voice_9.audioUrl has 25236 (88.6%) missing values | Missing |
voice_9.fileName has 25212 (88.5%) missing values | Missing |
voice_9.prompt has 25212 (88.5%) missing values | Missing |
_id is uniformly distributed | Uniform |
application_id is uniformly distributed | Uniform |
createdAt is uniformly distributed | Uniform |
trial is uniformly distributed | Uniform |
user is uniformly distributed | Uniform |
voice_1.fileName is uniformly distributed | Uniform |
voice_10.fileName is uniformly distributed | Uniform |
voice_2.fileName is uniformly distributed | Uniform |
voice_3.fileName is uniformly distributed | Uniform |
voice_4.fileName is uniformly distributed | Uniform |
voice_5.fileName is uniformly distributed | Uniform |
voice_6.fileName is uniformly distributed | Uniform |
voice_7.fileName is uniformly distributed | Uniform |
voice_8.fileName is uniformly distributed | Uniform |
voice_9.fileName is uniformly distributed | Uniform |
_id has unique values | Unique |
createdAt has unique values | Unique |
percent is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
scoreBreakdown.pickIncorrect is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
scoreBreakdown.tenses is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
scoreBreakdown.wordSelection is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_1.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_2.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_3.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_4.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_5.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_6.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_7.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
voice_8.GCSData is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
| Analysis started | 2023-02-08 10:40:50.928877 |
|---|---|
| Analysis finished | 2023-02-08 10:41:28.020829 |
| Duration | 37.09 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
_id
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Distinct | 28494 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Characters and Unicode
| Total characters | 484398 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 28494 |
|---|---|
| Unique (%) | 100.0% |
Common Values
| Value | Count | Frequency (%) |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (28484) | 28484 |
Most occurring characters
| Value | Count | Frequency (%) |
| 9009 | 1.9% | |
| 8980 | 1.9% | |
| 8977 | 1.9% | |
| 8975 | 1.9% | |
| 8939 | 1.8% | |
| 8920 | 1.8% | |
| 8910 | 1.8% | |
| 8900 | 1.8% | |
| 8900 | 1.8% | |
| 8898 | 1.8% | |
| Other values (45) | 394990 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 220551 | 45.5% |
| Uppercase Letter | 193595 | 40.0% |
| Decimal Number | 70252 | 14.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9009 | 4.1% | |
| 8977 | 4.1% | |
| 8975 | 4.1% | |
| 8939 | 4.1% | |
| 8920 | 4.0% | |
| 8900 | 4.0% | |
| 8900 | 4.0% | |
| 8898 | 4.0% | |
| 8834 | 4.0% | |
| 8821 | 4.0% | |
| Other values (15) | 131378 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 8980 | 4.6% | |
| 8910 | 4.6% | |
| 8890 | 4.6% | |
| 8888 | 4.6% | |
| 8881 | 4.6% | |
| 8856 | 4.6% | |
| 8845 | 4.6% | |
| 8838 | 4.6% | |
| 8818 | 4.6% | |
| 8812 | 4.6% | |
| Other values (12) | 104877 |
Decimal Number
| Value | Count | Frequency (%) |
| 8877 | ||
| 8825 | ||
| 8823 | ||
| 8787 | ||
| 8764 | ||
| 8755 | ||
| 8729 | ||
| 8692 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 414146 | 85.5% |
| Common | 70252 | 14.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9009 | 2.2% | |
| 8980 | 2.2% | |
| 8977 | 2.2% | |
| 8975 | 2.2% | |
| 8939 | 2.2% | |
| 8920 | 2.2% | |
| 8910 | 2.2% | |
| 8900 | 2.1% | |
| 8900 | 2.1% | |
| 8898 | 2.1% | |
| Other values (37) | 324738 |
Common
| Value | Count | Frequency (%) |
| 8877 | ||
| 8825 | ||
| 8823 | ||
| 8787 | ||
| 8764 | ||
| 8755 | ||
| 8729 | ||
| 8692 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 484398 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9009 | 1.9% | |
| 8980 | 1.9% | |
| 8977 | 1.9% | |
| 8975 | 1.9% | |
| 8939 | 1.8% | |
| 8920 | 1.8% | |
| 8910 | 1.8% | |
| 8900 | 1.8% | |
| 8900 | 1.8% | |
| 8898 | 1.8% | |
| Other values (45) | 394990 |
application_id
Categorical
HIGH CARDINALITY  UNIFORM 
| Distinct | 3878 |
|---|---|
| Distinct (%) | 13.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Characters and Unicode
| Total characters | 484398 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 85 |
|---|---|
| Unique (%) | 0.3% |
Common Values
| Value | Count | Frequency (%) |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| Other values (3868) | 28384 |
Most occurring characters
| Value | Count | Frequency (%) |
| 9496 | 2.0% | |
| 9405 | 1.9% | |
| 9361 | 1.9% | |
| 9323 | 1.9% | |
| 9292 | 1.9% | |
| 9134 | 1.9% | |
| 9121 | 1.9% | |
| 9090 | 1.9% | |
| 9060 | 1.9% | |
| 9041 | 1.9% | |
| Other values (45) | 392075 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 220382 | 45.5% |
| Uppercase Letter | 194136 | 40.1% |
| Decimal Number | 69880 | 14.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9405 | 4.3% | |
| 9323 | 4.2% | |
| 9292 | 4.2% | |
| 9134 | 4.1% | |
| 9121 | 4.1% | |
| 9060 | 4.1% | |
| 9031 | 4.1% | |
| 8971 | 4.1% | |
| 8965 | 4.1% | |
| 8947 | 4.1% | |
| Other values (15) | 129133 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 9496 | 4.9% | |
| 9361 | 4.8% | |
| 9041 | 4.7% | |
| 9018 | 4.6% | |
| 8969 | 4.6% | |
| 8938 | 4.6% | |
| 8898 | 4.6% | |
| 8894 | 4.6% | |
| 8863 | 4.6% | |
| 8843 | 4.6% | |
| Other values (12) | 103815 |
Decimal Number
| Value | Count | Frequency (%) |
| 9090 | ||
| 9030 | ||
| 8759 | ||
| 8737 | ||
| 8714 | ||
| 8542 | ||
| 8532 | ||
| 8476 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 414518 | 85.6% |
| Common | 69880 | 14.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9496 | 2.3% | |
| 9405 | 2.3% | |
| 9361 | 2.3% | |
| 9323 | 2.2% | |
| 9292 | 2.2% | |
| 9134 | 2.2% | |
| 9121 | 2.2% | |
| 9060 | 2.2% | |
| 9041 | 2.2% | |
| 9031 | 2.2% | |
| Other values (37) | 322254 |
Common
| Value | Count | Frequency (%) |
| 9090 | ||
| 9030 | ||
| 8759 | ||
| 8737 | ||
| 8714 | ||
| 8542 | ||
| 8532 | ||
| 8476 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 484398 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9496 | 2.0% | |
| 9405 | 1.9% | |
| 9361 | 1.9% | |
| 9323 | 1.9% | |
| 9292 | 1.9% | |
| 9134 | 1.9% | |
| 9121 | 1.9% | |
| 9090 | 1.9% | |
| 9060 | 1.9% | |
| 9041 | 1.9% | |
| Other values (45) | 392075 |
automaticScore
Real number (ℝ)
| Distinct | 3062 |
|---|---|
| Distinct (%) | 93.3% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 69.718207 |
| Minimum | 0 |
|---|---|
| Maximum | 96.824233 |
| Zeros | 221 |
| Zeros (%) | 0.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 222.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 58.322642 |
| median | 82.007264 |
| Q3 | 90.090525 |
| 95-th percentile | 93.95771 |
| Maximum | 96.824233 |
| Range | 96.824233 |
| Interquartile range (IQR) | 31.767883 |
Descriptive statistics
| Standard deviation | 27.885505 |
|---|---|
| Coefficient of variation (CV) | 0.3999745 |
| Kurtosis | 0.62540064 |
| Mean | 69.718207 |
| Median Absolute Deviation (MAD) | 9.9783996 |
| Skewness | -1.3243327 |
| Sum | 228815.16 |
| Variance | 777.60141 |
| Monotonicity | Not monotonic |
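Several of the descriptive statistics above are simple functions of the others, which makes the table easy to sanity-check. The sketch below recomputes them from the reported figures. It assumes the mean and sum are taken over non-missing values only. The variable names are illustrative and small differences are expected because the inputs are already rounded.

```python
# Figures copied from the automaticScore tables above.
n_observations = 28494
n_missing = 25212
mean = 69.718207
std = 27.885505
q1, q3 = 58.322642, 90.090525
minimum, maximum = 0.0, 96.824233

# Assumption: statistics are computed over non-missing values only.
n_present = n_observations - n_missing

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
total = mean * n_present           # sum of all non-missing values

print(f"Non-missing values: {n_present}")
print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.5f}")
print(f"IQR: {iqr:.6f}")
print(f"Range: {value_range:.6f}")
print(f"Sum: {total:.2f}")
```

Each recomputed value agrees with the corresponding table entry to within the rounding of the inputs.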
Common Values
| Value | Count | Frequency (%) |
| 0 | 221 | 0.8% |
| 86.5555387 | 1 | < 0.1% |
| 81.25676641 | 1 | < 0.1% |
| 74.67100702 | 1 | < 0.1% |
| 75.91230446 | 1 | < 0.1% |
| 39.76567487 | 1 | < 0.1% |
| 51.4256542 | 1 | < 0.1% |
| 60.29269483 | 1 | < 0.1% |
| 75.12970533 | 1 | < 0.1% |
| 74.38011456 | 1 | < 0.1% |
| Other values (3052) | 3052 | 10.7% |
| (Missing) | 25212 | 88.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 221 | 0.8% |
| 0.1521550594 | 1 | < 0.1% |
| 0.1872758567 | 1 | < 0.1% |
| 0.2181366425 | 1 | < 0.1% |
| 0.2794063445 | 1 | < 0.1% |
| 0.3978414237 | 1 | < 0.1% |
| 0.6250090081 | 1 | < 0.1% |
| 0.8566477636 | 1 | < 0.1% |
| 1.308530501 | 1 | < 0.1% |
| 1.438634913 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 96.82423253 | 1 | < 0.1% |
| 96.61722445 | 1 | < 0.1% |
| 96.50841309 | 1 | < 0.1% |
| 96.46214604 | 1 | < 0.1% |
| 96.4543435 | 1 | < 0.1% |
| 96.31357894 | 1 | < 0.1% |
| 96.30015264 | 1 | < 0.1% |
| 96.28455193 | 1 | < 0.1% |
| 96.25887704 | 1 | < 0.1% |
| 96.13473984 | 1 | < 0.1% |
createdAt
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Distinct | 28494 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
Length
| Max length | 24 |
|---|---|
| Median length | 24 |
| Mean length | 24 |
| Min length | 24 |
Characters and Unicode
| Total characters | 683856 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 28494 |
|---|---|
| Unique (%) | 100.0% |
Common Values
| Value | Count | Frequency (%) |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (28484) | 28484 |
Most occurring characters
| Value | Count | Frequency (%) |
| 122619 | ||
| 107625 | ||
| 64356 | ||
| 56988 | ||
| 56988 | ||
| 38263 | 5.6% | |
| 29704 | 4.3% | |
| 28494 | 4.2% | |
| 28494 | 4.2% | |
| 28494 | 4.2% | |
| Other values (5) | 121831 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 484398 | 70.8% |
| Other Punctuation | 85482 | 12.5% |
| Dash Punctuation | 56988 | 8.3% |
| Uppercase Letter | 56988 | 8.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 122619 | ||
| 107625 | ||
| 64356 | ||
| 38263 | 7.9% | |
| 29704 | 6.1% | |
| 28237 | 5.8% | |
| 24301 | 5.0% | |
| 23787 | 4.9% | |
| 23291 | 4.8% | |
| 22215 | 4.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| 56988 | ||
| 28494 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 28494 | ||
| 28494 |
Dash Punctuation
| Value | Count | Frequency (%) |
| 56988 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 626868 | 91.7% |
| Latin | 56988 | 8.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 122619 | ||
| 107625 | ||
| 64356 | ||
| 56988 | ||
| 56988 | ||
| 38263 | 6.1% | |
| 29704 | 4.7% | |
| 28494 | 4.5% | |
| 28237 | 4.5% | |
| 24301 | 3.9% | |
| Other values (3) | 69293 |
Latin
| Value | Count | Frequency (%) |
| 28494 | ||
| 28494 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 683856 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 122619 | ||
| 107625 | ||
| 64356 | ||
| 56988 | ||
| 56988 | ||
| 38263 | 5.6% | |
| 29704 | 4.3% | |
| 28494 | 4.2% | |
| 28494 | 4.2% | |
| 28494 | 4.2% | |
| Other values (5) | 121831 |
deviceCheckVoice
Categorical
CONSTANT  MISSING 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 25220 |
| Missing (%) | 88.5% |
| Memory size | 1012.1 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Characters and Unicode
| Total characters | 42562 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 3274 | 11.5% | |
| (Missing) | 25220 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3274 |
Most occurring characters
| Value | Count | Frequency (%) |
| 9822 | ||
| 6548 | ||
| 6548 | ||
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 39288 | 92.3% |
| Uppercase Letter | 3274 | 7.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9822 | ||
| 6548 | ||
| 6548 | ||
| 3274 | 8.3% | |
| 3274 | 8.3% | |
| 3274 | 8.3% | |
| 3274 | 8.3% | |
| 3274 | 8.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3274 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 42562 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9822 | ||
| 6548 | ||
| 6548 | ||
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 42562 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9822 | ||
| 6548 | ||
| 6548 | ||
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% | |
| 3274 | 7.7% |
englishGrammar_1
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26183 |
| Missing (%) | 91.9% |
| Memory size | 975.3 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 12.539161 |
| Min length | 10 |
Characters and Unicode
| Total characters | 28978 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
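The length and character totals reported for the categorical variables above are mutually consistent: the total character count should equal the number of non-missing values multiplied by the mean length. The sketch below checks this for four variables. It assumes that missing values contribute no characters, and the helper name `expected_total_characters` is hypothetical.

```python
n_observations = 28494

# (missing count, mean length, reported total characters), copied from the report.
variables = {
    "_id": (0, 17, 484398),
    "createdAt": (0, 24, 683856),
    "deviceCheckVoice": (25220, 13, 42562),
    "englishGrammar_1": (26183, 12.539161, 28978),
}

def expected_total_characters(n_missing, mean_length):
    """Non-missing values times mean length, rounded to a whole character count."""
    return round((n_observations - n_missing) * mean_length)

results = {}
for name, (n_missing, mean_length, reported) in variables.items():
    estimate = expected_total_characters(n_missing, mean_length)
    results[name] = (estimate == reported)
    print(name, estimate, reported, results[name])
```

All four variables match their reported totals under that assumption.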
Common Values
| Value | Count | Frequency (%) |
| 1504 | 5.3% | |
| 567 | 2.0% | |
| 222 | 0.8% | |
| 18 | 0.1% | |
| (Missing) | 26183 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1726 | ||
| 1504 | ||
| 585 | 12.7% | |
| 567 | 12.3% | |
| 222 | 4.8% | |
| 18 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 6348 | ||
| 5541 | ||
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2089 | 7.2% | |
| 2071 | 7.1% | |
| 1726 | 6.0% | |
| 807 | 2.8% | |
| Other values (2) | 1152 | 4.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 24578 | 84.8% |
| Space Separator | 2311 | 8.0% |
| Other Punctuation | 2089 | 7.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 6348 | ||
| 5541 | ||
| 2311 | 9.4% | |
| 2311 | 9.4% | |
| 2311 | 9.4% | |
| 2071 | 8.4% | |
| 1726 | 7.0% | |
| 807 | 3.3% | |
| 585 | 2.4% | |
| 567 | 2.3% |
Space Separator
| Value | Count | Frequency (%) |
| 2311 |
Other Punctuation
| Value | Count | Frequency (%) |
| 2089 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 24578 | 84.8% |
| Common | 4400 | 15.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 6348 | ||
| 5541 | ||
| 2311 | 9.4% | |
| 2311 | 9.4% | |
| 2311 | 9.4% | |
| 2071 | 8.4% | |
| 1726 | 7.0% | |
| 807 | 3.3% | |
| 585 | 2.4% | |
| 567 | 2.3% |
Common
| Value | Count | Frequency (%) |
| 2311 | ||
| 2089 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 28978 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6348 | ||
| 5541 | ||
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2311 | 8.0% | |
| 2089 | 7.2% | |
| 2071 | 7.1% | |
| 1726 | 6.0% | |
| 807 | 2.8% | |
| Other values (2) | 1152 | 4.0% |
englishGrammar_10
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 26998 |
| Missing (%) | 94.7% |
| Memory size | 932.7 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 3.8389037 |
| Min length | 3 |
Characters and Unicode
| Total characters | 5743 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1159 | 4.1% | |
| 241 | 0.8% | |
| 70 | 0.2% | |
| 26 | 0.1% | |
| (Missing) | 26998 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1159 | ||
| 241 | 16.1% | |
| 70 | 4.7% | |
| 26 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1400 | ||
| 1159 | ||
| 1159 | ||
| 1159 | ||
| 241 | 4.2% | |
| 241 | 4.2% | |
| 140 | 2.4% | |
| 70 | 1.2% | |
| 70 | 1.2% | |
| 52 | 0.9% | |
| Other values (2) | 52 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5743 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1400 | ||
| 1159 | ||
| 1159 | ||
| 1159 | ||
| 241 | 4.2% | |
| 241 | 4.2% | |
| 140 | 2.4% | |
| 70 | 1.2% | |
| 70 | 1.2% | |
| 52 | 0.9% | |
| Other values (2) | 52 | 0.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5743 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1400 | ||
| 1159 | ||
| 1159 | ||
| 1159 | ||
| 241 | 4.2% | |
| 241 | 4.2% | |
| 140 | 2.4% | |
| 70 | 1.2% | |
| 70 | 1.2% | |
| 52 | 0.9% | |
| Other values (2) | 52 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5743 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1400 | ||
| 1159 | ||
| 1159 | ||
| 1159 | ||
| 241 | 4.2% | |
| 241 | 4.2% | |
| 140 | 2.4% | |
| 70 | 1.2% | |
| 70 | 1.2% | |
| 52 | 0.9% | |
| Other values (2) | 52 | 0.9% |
englishGrammar_11
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27004 |
| Missing (%) | 94.8% |
| Memory size | 932.6 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 3.9234899 |
| Min length | 2 |
Characters and Unicode
| Total characters | 5846 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1157 | 4.1% | |
| 194 | 0.7% | |
| 82 | 0.3% | |
| 57 | 0.2% | |
| (Missing) | 27004 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1157 | ||
| 194 | 13.0% | |
| 82 | 5.5% | |
| 57 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1433 | ||
| 1239 | ||
| 1239 | ||
| 1157 | ||
| 194 | 3.3% | |
| 194 | 3.3% | |
| 194 | 3.3% | |
| 82 | 1.4% | |
| 57 | 1.0% | |
| 57 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5846 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1433 | ||
| 1239 | ||
| 1239 | ||
| 1157 | ||
| 194 | 3.3% | |
| 194 | 3.3% | |
| 194 | 3.3% | |
| 82 | 1.4% | |
| 57 | 1.0% | |
| 57 | 1.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5846 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1433 | ||
| 1239 | ||
| 1239 | ||
| 1157 | ||
| 194 | 3.3% | |
| 194 | 3.3% | |
| 194 | 3.3% | |
| 82 | 1.4% | |
| 57 | 1.0% | |
| 57 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5846 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1433 | ||
| 1239 | ||
| 1239 | ||
| 1157 | ||
| 194 | 3.3% | |
| 194 | 3.3% | |
| 194 | 3.3% | |
| 82 | 1.4% | |
| 57 | 1.0% | |
| 57 | 1.0% |
englishGrammar_12
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27021 |
| Missing (%) | 94.8% |
| Memory size | 938.1 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 8.0353021 |
| Min length | 8 |
Characters and Unicode
| Total characters | 11836 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1393 | 4.9% | |
| 30 | 0.1% | |
| 28 | 0.1% | |
| 22 | 0.1% | |
| (Missing) | 27021 |
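The IMBALANCE alert for this variable reports a score of 80.5%. One way to reproduce that figure from the counts above is a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of distinct values. The sketch below assumes this definition. The function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math

def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H/log2(k): 0 for a uniform column, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    # Shannon entropy in bits over the non-missing class frequencies
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# "Common Values" counts for englishGrammar_12 (missing rows excluded)
counts_12 = [1393, 30, 28, 22]
score = imbalance_score(counts_12)
print(f"{score:.1%}")
```

With these four counts the score comes out at roughly 0.805, in line with the alert. A perfectly balanced four-class column would score 0.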
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1393 | ||
| 30 | 2.0% | |
| 28 | 1.9% | |
| 22 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2808 | ||
| 2808 | ||
| 1471 | ||
| 1465 | ||
| 1415 | ||
| 1415 | ||
| 110 | 0.9% | |
| 60 | 0.5% | |
| 56 | 0.5% | |
| 52 | 0.4% | |
| Other values (6) | 176 | 1.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11836 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2808 | ||
| 2808 | ||
| 1471 | ||
| 1465 | ||
| 1415 | ||
| 1415 | ||
| 110 | 0.9% | |
| 60 | 0.5% | |
| 56 | 0.5% | |
| 52 | 0.4% | |
| Other values (6) | 176 | 1.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11836 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2808 | ||
| 2808 | ||
| 1471 | ||
| 1465 | ||
| 1415 | ||
| 1415 | ||
| 110 | 0.9% | |
| 60 | 0.5% | |
| 56 | 0.5% | |
| 52 | 0.4% | |
| Other values (6) | 176 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11836 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2808 | ||
| 2808 | ||
| 1471 | ||
| 1465 | ||
| 1415 | ||
| 1415 | ||
| 110 | 0.9% | |
| 60 | 0.5% | |
| 56 | 0.5% | |
| 52 | 0.4% | |
| Other values (6) | 176 | 1.5% |
englishGrammar_13
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 26986 |
| Missing (%) | 94.7% |
| Memory size | 940.6 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.0059682 |
| Min length | 7 |
Characters and Unicode
| Total characters | 13581 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1364 | 4.8% | |
| 81 | 0.3% | |
| 45 | 0.2% | |
| 18 | 0.1% | |
| (Missing) | 26986 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1508 | ||
| 1364 | ||
| 81 | 2.7% | |
| 45 | 1.5% | |
| 18 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2728 | ||
| 1652 | ||
| 1508 | ||
| 1508 | ||
| 1508 | ||
| 1508 | ||
| 1445 | ||
| 1364 | ||
| 99 | 0.7% | |
| 99 | 0.7% | |
| Other values (2) | 162 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10466 | |
| Other Punctuation | 1607 | 11.8% |
| Space Separator | 1508 | 11.1% |
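The category counts can be checked against the "Total characters" value for this variable (13581): the three categories should sum to it, and each percentage should be the category's share of that total. This is a small sketch using only figures from this report, and the dictionary name is illustrative.

```python
# Unicode general-category counts for englishGrammar_13, copied from this report.
category_counts = {
    "Lowercase Letter": 10466,
    "Other Punctuation": 1607,
    "Space Separator": 1508,
}
TOTAL_CHARS = 13581  # "Total characters" for this variable

# Share of each category, as a percentage rounded to one decimal place
shares = {
    name: round(100 * count / TOTAL_CHARS, 1)
    for name, count in category_counts.items()
}

print(sum(category_counts.values()) == TOTAL_CHARS)
print(shares["Other Punctuation"], shares["Space Separator"])
```

The sum equals the reported total, and the two computed shares match the 11.8% and 11.1% shown in the table.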
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2728 | ||
| 1652 | ||
| 1508 | ||
| 1508 | ||
| 1445 | ||
| 1364 | ||
| 99 | 0.9% | |
| 99 | 0.9% | |
| 63 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| 1508 | ||
| 99 | 6.2% |
Space Separator
| Value | Count | Frequency (%) |
| 1508 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10466 | |
| Common | 3115 | 22.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2728 | ||
| 1652 | ||
| 1508 | ||
| 1508 | ||
| 1445 | ||
| 1364 | ||
| 99 | 0.9% | |
| 99 | 0.9% | |
| 63 | 0.6% |
Common
| Value | Count | Frequency (%) |
| 1508 | ||
| 1508 | ||
| 99 | 3.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13581 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2728 | ||
| 1652 | ||
| 1508 | ||
| 1508 | ||
| 1508 | ||
| 1508 | ||
| 1445 | ||
| 1364 | ||
| 99 | 0.7% | |
| 99 | 0.7% | |
| Other values (2) | 162 | 1.2% |
englishGrammar_14
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27052 |
| Missing (%) | 94.9% |
| Memory size | 930.5 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 3 |
| Mean length | 3.3522885 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4834 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1194 | 4.2% | |
| 214 | 0.8% | |
| 23 | 0.1% | |
| 11 | < 0.1% | |
| (Missing) | 27052 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1194 | ||
| 214 | 14.8% | |
| 23 | 1.6% | |
| 11 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1217 | ||
| 1217 | ||
| 1194 | ||
| 260 | 5.4% | |
| 237 | 4.9% | |
| 237 | 4.9% | |
| 225 | 4.7% | |
| 214 | 4.4% | |
| 22 | 0.5% | |
| 11 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4834 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1217 | ||
| 1217 | ||
| 1194 | ||
| 260 | 5.4% | |
| 237 | 4.9% | |
| 237 | 4.9% | |
| 225 | 4.7% | |
| 214 | 4.4% | |
| 22 | 0.5% | |
| 11 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4834 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1217 | ||
| 1217 | ||
| 1194 | ||
| 260 | 5.4% | |
| 237 | 4.9% | |
| 237 | 4.9% | |
| 225 | 4.7% | |
| 214 | 4.4% | |
| 22 | 0.5% | |
| 11 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4834 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1217 | ||
| 1217 | ||
| 1194 | ||
| 260 | 5.4% | |
| 237 | 4.9% | |
| 237 | 4.9% | |
| 225 | 4.7% | |
| 214 | 4.4% | |
| 22 | 0.5% | |
| 11 | 0.2% |
englishGrammar_15
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26157 |
| Missing (%) | 91.8% |
| Memory size | 972.4 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 11 |
| Mean length | 10.845101 |
| Min length | 8 |
Characters and Unicode
| Total characters | 25345 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2167 | 7.6% | |
| 79 | 0.3% | |
| 54 | 0.2% | |
| 37 | 0.1% | |
| (Missing) | 26157 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2283 | ||
| 2221 | ||
| 54 | 1.2% | |
| 37 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4558 | ||
| 4504 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2283 | ||
| 2283 | ||
| 2258 | ||
| 37 | 0.1% | |
| Other values (2) | 74 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 23087 | |
| Space Separator | 2258 | 8.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4558 | ||
| 4504 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2283 | ||
| 2283 | ||
| 37 | 0.2% | |
| 37 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| 2258 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 23087 | |
| Common | 2258 | 8.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4558 | ||
| 4504 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2283 | ||
| 2283 | ||
| 37 | 0.2% | |
| 37 | 0.2% |
Common
| Value | Count | Frequency (%) |
| 2258 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 25345 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4558 | ||
| 4504 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2337 | ||
| 2283 | ||
| 2283 | ||
| 2258 | ||
| 37 | 0.1% | |
| Other values (2) | 74 | 0.3% |
englishGrammar_16
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26115 |
| Missing (%) | 91.7% |
| Memory size | 958.2 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 4 |
| Mean length | 4.1185372 |
| Min length | 4 |
Characters and Unicode
| Total characters | 9798 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2260 | 7.9% | |
| 67 | 0.2% | |
| 36 | 0.1% | |
| 16 | 0.1% | |
| (Missing) | 26115 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2260 | ||
| 83 | 3.5% | |
| 36 | 1.5% | |
| 16 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2462 | ||
| 2379 | ||
| 2379 | ||
| 2379 | ||
| 83 | 0.8% | |
| 36 | 0.4% | |
| 32 | 0.3% | |
| 16 | 0.2% | |
| 16 | 0.2% | |
| 16 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9782 | |
| Space Separator | 16 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2462 | ||
| 2379 | ||
| 2379 | ||
| 2379 | ||
| 83 | 0.8% | |
| 36 | 0.4% | |
| 32 | 0.3% | |
| 16 | 0.2% | |
| 16 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| 16 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9782 | |
| Common | 16 | 0.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2462 | ||
| 2379 | ||
| 2379 | ||
| 2379 | ||
| 83 | 0.8% | |
| 36 | 0.4% | |
| 32 | 0.3% | |
| 16 | 0.2% | |
| 16 | 0.2% |
Common
| Value | Count | Frequency (%) |
| 16 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9798 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2462 | ||
| 2379 | ||
| 2379 | ||
| 2379 | ||
| 83 | 0.8% | |
| 36 | 0.4% | |
| 32 | 0.3% | |
| 16 | 0.2% | |
| 16 | 0.2% | |
| 16 | 0.2% |
englishGrammar_17
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26156 |
| Missing (%) | 91.8% |
| Memory size | 968.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.0804106 |
| Min length | 2 |
Characters and Unicode
| Total characters | 21230 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1633 | 5.7% | |
| 664 | 2.3% | |
| 34 | 0.1% | |
| 7 | < 0.1% | |
| (Missing) | 26156 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3971 | ||
| 2331 |
Most occurring characters
| Value | Count | Frequency (%) |
| 4662 | ||
| 3971 | ||
| 3971 | ||
| 3964 | ||
| 2331 | ||
| 2331 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 17266 | |
| Space Separator | 3964 | 18.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4662 | ||
| 3971 | ||
| 3971 | ||
| 2331 | ||
| 2331 |
Space Separator
| Value | Count | Frequency (%) |
| 3964 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17266 | |
| Common | 3964 | 18.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4662 | ||
| 3971 | ||
| 3971 | ||
| 2331 | ||
| 2331 |
Common
| Value | Count | Frequency (%) |
| 3964 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21230 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4662 | ||
| 3971 | ||
| 3971 | ||
| 3964 | ||
| 2331 | ||
| 2331 |
englishGrammar_18
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26130 |
| Missing (%) | 91.7% |
| Memory size | 987.3 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 17 |
| Mean length | 16.918359 |
| Min length | 8 |
Characters and Unicode
| Total characters | 39995 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2251 | 7.9% | |
| 80 | 0.3% | |
| 22 | 0.1% | |
| 11 | < 0.1% | |
| (Missing) | 26130 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2273 | ||
| 2273 | ||
| 91 | 1.9% | |
| 80 | 1.7% | |
| 33 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 6990 | ||
| 6979 | ||
| 4717 | ||
| 2386 | 6.0% | |
| 2386 | 6.0% | |
| 2386 | 6.0% | |
| 2364 | 5.9% | |
| 2364 | 5.9% | |
| 2364 | 5.9% | |
| 2353 | 5.9% | |
| Other values (3) | 4706 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 37609 | |
| Space Separator | 2386 | 6.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 6990 | ||
| 6979 | ||
| 4717 | ||
| 2386 | 6.3% | |
| 2386 | 6.3% | |
| 2364 | 6.3% | |
| 2364 | 6.3% | |
| 2364 | 6.3% | |
| 2353 | 6.3% | |
| 2353 | 6.3% | |
| Other values (2) | 2353 | 6.3% |
Space Separator
| Value | Count | Frequency (%) |
| 2386 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 37609 | |
| Common | 2386 | 6.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 6990 | ||
| 6979 | ||
| 4717 | ||
| 2386 | 6.3% | |
| 2386 | 6.3% | |
| 2364 | 6.3% | |
| 2364 | 6.3% | |
| 2364 | 6.3% | |
| 2353 | 6.3% | |
| 2353 | 6.3% | |
| Other values (2) | 2353 | 6.3% |
Common
| Value | Count | Frequency (%) |
| 2386 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 39995 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6990 | ||
| 6979 | ||
| 4717 | ||
| 2386 | 6.0% | |
| 2386 | 6.0% | |
| 2386 | 6.0% | |
| 2364 | 5.9% | |
| 2364 | 5.9% | |
| 2364 | 5.9% | |
| 2353 | 5.9% | |
| Other values (3) | 4706 |
englishGrammar_19
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26585 |
| Missing (%) | 93.3% |
| Memory size | 942.2 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 2 |
| Mean length | 2.7014144 |
| Min length | 2 |
Characters and Unicode
| Total characters | 5157 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1440 | 5.1% | |
| 165 | 0.6% | |
| 163 | 0.6% | |
| 141 | 0.5% | |
| (Missing) | 26585 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1440 | ||
| 165 | 8.6% | |
| 163 | 8.5% | |
| 141 | 7.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1603 | ||
| 1581 | ||
| 610 | 11.8% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 163 | 3.2% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| Other values (2) | 282 | 5.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3389 | |
| Uppercase Letter | 1768 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1581 | ||
| 610 | 18.0% | |
| 165 | 4.9% | |
| 165 | 4.9% | |
| 163 | 4.8% | |
| 141 | 4.2% | |
| 141 | 4.2% | |
| 141 | 4.2% | |
| 141 | 4.2% | |
| 141 | 4.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1603 | ||
| 165 | 9.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5157 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1603 | ||
| 1581 | ||
| 610 | 11.8% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 163 | 3.2% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| Other values (2) | 282 | 5.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5157 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1603 | ||
| 1581 | ||
| 610 | 11.8% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 165 | 3.2% | |
| 163 | 3.2% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| 141 | 2.7% | |
| Other values (2) | 282 | 5.5% |
englishGrammar_2
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26136 |
| Missing (%) | 91.7% |
| Memory size | 960.6 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.4003393 |
| Min length | 5 |
Characters and Unicode
| Total characters | 12734 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2122 | 7.4% | |
| 118 | 0.4% | |
| 59 | 0.2% | |
| 59 | 0.2% | |
| (Missing) | 26136 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2299 | ||
| 118 | 4.7% | |
| 59 | 2.3% | |
| 59 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2476 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 236 | 1.9% | |
| 177 | 1.4% | |
| 118 | 0.9% | |
| 118 | 0.9% | |
| 59 | 0.5% | |
| Other values (2) | 118 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 12557 | |
| Space Separator | 177 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2476 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 236 | 1.9% | |
| 118 | 0.9% | |
| 118 | 0.9% | |
| 59 | 0.5% | |
| 59 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| 177 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12557 | |
| Common | 177 | 1.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2476 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 236 | 1.9% | |
| 118 | 0.9% | |
| 118 | 0.9% | |
| 59 | 0.5% | |
| 59 | 0.5% |
Common
| Value | Count | Frequency (%) |
| 177 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12734 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2476 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 2358 | ||
| 236 | 1.9% | |
| 177 | 1.4% | |
| 118 | 0.9% | |
| 118 | 0.9% | |
| 59 | 0.5% | |
| Other values (2) | 118 | 0.9% |
englishGrammar_20
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26581 |
| Missing (%) | 93.3% |
| Memory size | 948.9 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 6.2205959 |
| Min length | 2 |
Characters and Unicode
| Total characters | 11900 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1345 | 4.7% | |
| 335 | 1.2% | |
| 141 | 0.5% | |
| 92 | 0.3% | |
| (Missing) | 26581 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1345 | ||
| 335 | 17.5% | |
| 141 | 7.4% | |
| 92 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2690 | ||
| 1719 | ||
| 1680 | ||
| 1670 | ||
| 1437 | ||
| 1345 | ||
| 476 | 4.0% | |
| 233 | 2.0% | |
| 184 | 1.5% | |
| 141 | 1.2% | |
| Other values (3) | 325 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11900 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2690 | ||
| 1719 | ||
| 1680 | ||
| 1670 | ||
| 1437 | ||
| 1345 | ||
| 476 | 4.0% | |
| 233 | 2.0% | |
| 184 | 1.5% | |
| 141 | 1.2% | |
| Other values (3) | 325 | 2.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11900 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2690 | ||
| 1719 | ||
| 1680 | ||
| 1670 | ||
| 1437 | ||
| 1345 | ||
| 476 | 4.0% | |
| 233 | 2.0% | |
| 184 | 1.5% | |
| 141 | 1.2% | |
| Other values (3) | 325 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11900 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2690 | ||
| 1719 | ||
| 1680 | ||
| 1670 | ||
| 1437 | ||
| 1345 | ||
| 476 | 4.0% | |
| 233 | 2.0% | |
| 184 | 1.5% | |
| 141 | 1.2% | |
| Other values (3) | 325 | 2.7% |
englishGrammar_21
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26595 |
| Missing (%) | 93.3% |
| Memory size | 944.5 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.0968931 |
| Min length | 4 |
Characters and Unicode
| Total characters | 7780 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1540 | 5.4% | |
| 213 | 0.7% | |
| 92 | 0.3% | |
| 54 | 0.2% | |
| (Missing) | 26595 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1540 | ||
| 213 | 11.2% | |
| 92 | 4.8% | |
| 54 | 2.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1845 | ||
| 1540 | ||
| 1540 | ||
| 1540 | ||
| 267 | 3.4% | |
| 213 | 2.7% | |
| 213 | 2.7% | |
| 184 | 2.4% | |
| 146 | 1.9% | |
| 92 | 1.2% | |
| Other values (3) | 200 | 2.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6240 | |
| Uppercase Letter | 1540 | 19.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1845 | ||
| 1540 | ||
| 1540 | ||
| 267 | 4.3% | |
| 213 | 3.4% | |
| 213 | 3.4% | |
| 184 | 2.9% | |
| 146 | 2.3% | |
| 92 | 1.5% | |
| 92 | 1.5% | |
| Other values (2) | 108 | 1.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1540 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7780 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1845 | ||
| 1540 | ||
| 1540 | ||
| 1540 | ||
| 267 | 3.4% | |
| 213 | 2.7% | |
| 213 | 2.7% | |
| 184 | 2.4% | |
| 146 | 1.9% | |
| 92 | 1.2% | |
| Other values (3) | 200 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7780 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1845 | ||
| 1540 | ||
| 1540 | ||
| 1540 | ||
| 267 | 3.4% | |
| 213 | 2.7% | |
| 213 | 2.7% | |
| 184 | 2.4% | |
| 146 | 1.9% | |
| 92 | 1.2% | |
| Other values (3) | 200 | 2.6% |
englishGrammar_22
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26624 |
| Missing (%) | 93.4% |
| Memory size | 947.6 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 6.2486631 |
| Min length | 3 |
Characters and Unicode
| Total characters | 11685 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1094 | 3.8% | |
| 500 | 1.8% | |
| 171 | 0.6% | |
| 105 | 0.4% | |
| (Missing) | 26624 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1094 | ||
| 500 | ||
| 171 | 9.1% | |
| 105 | 5.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2359 | ||
| 2188 | ||
| 1304 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 671 | 5.7% | |
| 605 | 5.2% | |
| 500 | 4.3% | |
| 500 | 4.3% | |
| Other values (2) | 276 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11685 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2359 | ||
| 2188 | ||
| 1304 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 671 | 5.7% | |
| 605 | 5.2% | |
| 500 | 4.3% | |
| 500 | 4.3% | |
| Other values (2) | 276 | 2.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11685 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2359 | ||
| 2188 | ||
| 1304 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 671 | 5.7% | |
| 605 | 5.2% | |
| 500 | 4.3% | |
| 500 | 4.3% | |
| Other values (2) | 276 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11685 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2359 | ||
| 2188 | ||
| 1304 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 671 | 5.7% | |
| 605 | 5.2% | |
| 500 | 4.3% | |
| 500 | 4.3% | |
| Other values (2) | 276 | 2.4% |
englishGrammar_23
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27028 |
| Missing (%) | 94.9% |
| Memory size | 936.6 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.1446112 |
| Min length | 5 |
Characters and Unicode
| Total characters | 10474 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1048 | 3.7% | |
| 278 | 1.0% | |
| 118 | 0.4% | |
| 22 | 0.1% | |
| (Missing) | 27028 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1048 | ||
| 278 | 19.0% | |
| 118 | 8.0% | |
| 22 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2374 | ||
| 1348 | ||
| 1070 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 300 | 2.9% | |
| 278 | 2.7% | |
| 278 | 2.7% | |
| Other values (4) | 634 | 6.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10474 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2374 | ||
| 1348 | ||
| 1070 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 300 | 2.9% | |
| 278 | 2.7% | |
| 278 | 2.7% | |
| Other values (4) | 634 | 6.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10474 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2374 | ||
| 1348 | ||
| 1070 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 300 | 2.9% | |
| 278 | 2.7% | |
| 278 | 2.7% | |
| Other values (4) | 634 | 6.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10474 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2374 | ||
| 1348 | ||
| 1070 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 1048 | ||
| 300 | 2.9% | |
| 278 | 2.7% | |
| 278 | 2.7% | |
| Other values (4) | 634 | 6.1% |
englishGrammar_24
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 26993 |
| Missing (%) | 94.7% |
| Memory size | 936.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 6 |
| Mean length | 6.2738175 |
| Min length | 5 |
Characters and Unicode
| Total characters | 9417 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1363 | 4.8% | |
| 106 | 0.4% | |
| 25 | 0.1% | |
| 7 | < 0.1% | |
| (Missing) | 26993 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1363 | ||
| 106 | 6.9% | |
| 25 | 1.6% | |
| 25 | 1.6% | |
| 7 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2839 | ||
| 1600 | ||
| 1469 | ||
| 1388 | ||
| 1370 | ||
| 319 | 3.4% | |
| 138 | 1.5% | |
| 113 | 1.2% | |
| 106 | 1.1% | |
| 25 | 0.3% | |
| Other values (2) | 50 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9392 | |
| Space Separator | 25 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2839 | ||
| 1600 | ||
| 1469 | ||
| 1388 | ||
| 1370 | ||
| 319 | 3.4% | |
| 138 | 1.5% | |
| 113 | 1.2% | |
| 106 | 1.1% | |
| 25 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| 25 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9392 | |
| Common | 25 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2839 | ||
| 1600 | ||
| 1469 | ||
| 1388 | ||
| 1370 | ||
| 319 | 3.4% | |
| 138 | 1.5% | |
| 113 | 1.2% | |
| 106 | 1.1% | |
| 25 | 0.3% |
Common
| Value | Count | Frequency (%) |
| 25 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9417 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2839 | ||
| 1600 | ||
| 1469 | ||
| 1388 | ||
| 1370 | ||
| 319 | 3.4% | |
| 138 | 1.5% | |
| 113 | 1.2% | |
| 106 | 1.1% | |
| 25 | 0.3% | |
| Other values (2) | 50 | 0.5% |
englishGrammar_25
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27031 |
| Missing (%) | 94.9% |
| Memory size | 932.0 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 3.9781271 |
| Min length | 2 |
Characters and Unicode
| Total characters | 5820 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1438 | 5.0% | |
| 13 | < 0.1% | |
| 7 | < 0.1% | |
| 5 | < 0.1% | |
| (Missing) | 27031 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1438 | ||
| 13 | 0.9% | |
| 7 | 0.5% | |
| 5 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1463 | ||
| 1463 | ||
| 1451 | ||
| 1438 | ||
| 5 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5820 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1463 | ||
| 1463 | ||
| 1451 | ||
| 1438 | ||
| 5 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5820 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1463 | ||
| 1463 | ||
| 1451 | ||
| 1438 | ||
| 5 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5820 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1463 | ||
| 1463 | ||
| 1451 | ||
| 1438 | ||
| 5 | 0.1% |
englishGrammar_26
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27060 |
| Missing (%) | 95.0% |
| Memory size | 941.0 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 11 |
| Mean length | 11.011158 |
| Min length | 9 |
Characters and Unicode
| Total characters | 15790 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1400 | 4.9% | |
| 14 | < 0.1% | |
| 13 | < 0.1% | |
| 7 | < 0.1% | |
| (Missing) | 27060 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1400 | ||
| 14 | 1.0% | |
| 13 | 0.9% | |
| 7 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2814 | ||
| 2807 | ||
| 1461 | ||
| 1441 | ||
| 1435 | ||
| 1434 | ||
| 1414 | ||
| 1407 | ||
| 1400 | ||
| 41 | 0.3% | |
| Other values (6) | 136 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 15790 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 2814 | ||
| 2807 | ||
| 1461 | ||
| 1441 | ||
| 1435 | ||
| 1434 | ||
| 1414 | ||
| 1407 | ||
| 1400 | ||
| 41 | 0.3% | |
| Other values (6) | 136 | 0.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15790 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 2814 | ||
| 2807 | ||
| 1461 | ||
| 1441 | ||
| 1435 | ||
| 1434 | ||
| 1414 | ||
| 1407 | ||
| 1400 | ||
| 41 | 0.3% | |
| Other values (6) | 136 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15790 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2814 | ||
| 2807 | ||
| 1461 | ||
| 1441 | ||
| 1435 | ||
| 1434 | ||
| 1414 | ||
| 1407 | ||
| 1400 | ||
| 41 | 0.3% | |
| Other values (6) | 136 | 0.9% |
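The percentages reported for englishGrammar_26 appear to use two different denominators. Missing (%) and the Common Values frequencies are consistent with dividing by all 28494 observations, while Distinct (%) and the frequencies in the table that follows "Common Values (Plot)" are consistent with dividing by the 1434 non-missing entries. The Python sketch below checks that reading against figures copied from the tables above. The helper name `pct` is hypothetical, and rounding to one decimal place is an assumption about how the report formats its percentages.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


N_ROWS = 28494                 # number of observations in the dataset
N_MISSING = 27060              # missing entries in englishGrammar_26
N_VALID = N_ROWS - N_MISSING   # non-missing entries

# Denominator = all rows
missing_pct = pct(N_MISSING, N_ROWS)     # report shows 95.0%
top_value_pct = pct(1400, N_ROWS)        # report shows 4.9%

# Denominator = non-missing rows only
distinct_pct = pct(4, N_VALID)           # report shows 0.3%
second_value_pct = pct(14, N_VALID)      # report shows 1.0%

print(N_VALID, missing_pct, top_value_pct, distinct_pct, second_value_pct)
```

The same two-denominator pattern can be checked for the other categorical variables by swapping in their Missing and Count figures.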
englishGrammar_27
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27097 |
| Missing (%) | 95.1% |
| Memory size | 932.3 KiB |
Length
| Max length | 14 |
|---|---|
| Median length | 5 |
| Mean length | 5.5934145 |
| Min length | 4 |
Characters and Unicode
| Total characters | 7814 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1146 | 4.0% | |
| 94 | 0.3% | |
| 87 | 0.3% | |
| 70 | 0.2% | |
| (Missing) | 27097 | 95.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1146 | ||
| 94 | 5.9% | |
| 94 | 5.9% | |
| 94 | 5.9% | |
| 87 | 5.5% | |
| 70 | 4.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1491 | ||
| 1421 | ||
| 1310 | ||
| 1146 | ||
| 1146 | ||
| 282 | 3.6% | |
| 258 | 3.3% | |
| 258 | 3.3% | |
| 188 | 2.4% | |
| 87 | 1.1% | |
| Other values (3) | 227 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7626 | |
| Space Separator | 188 | 2.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1491 | ||
| 1421 | ||
| 1310 | ||
| 1146 | ||
| 1146 | ||
| 282 | 3.7% | |
| 258 | 3.4% | |
| 258 | 3.4% | |
| 87 | 1.1% | |
| 87 | 1.1% | |
| Other values (2) | 140 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| 188 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7626 | |
| Common | 188 | 2.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1491 | ||
| 1421 | ||
| 1310 | ||
| 1146 | ||
| 1146 | ||
| 282 | 3.7% | |
| 258 | 3.4% | |
| 258 | 3.4% | |
| 87 | 1.1% | |
| 87 | 1.1% | |
| Other values (2) | 140 | 1.8% |
Common
| Value | Count | Frequency (%) |
| 188 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7814 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1491 | ||
| 1421 | ||
| 1310 | ||
| 1146 | ||
| 1146 | ||
| 282 | 3.6% | |
| 258 | 3.3% | |
| 258 | 3.3% | |
| 188 | 2.4% | |
| 87 | 1.1% | |
| Other values (3) | 227 | 2.9% |
englishGrammar_28
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27049 |
| Missing (%) | 94.9% |
| Memory size | 932.1 KiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.4352941 |
| Min length | 4 |
Characters and Unicode
| Total characters | 6409 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 669 | 2.3% | |
| 604 | 2.1% | |
| 147 | 0.5% | |
| 25 | 0.1% | |
| (Missing) | 27049 | 94.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 669 | ||
| 604 | ||
| 147 | 10.2% | |
| 25 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1445 | ||
| 1445 | ||
| 1298 | ||
| 841 | ||
| 604 | ||
| 604 | ||
| 147 | 2.3% | |
| 25 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6409 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1445 | ||
| 1445 | ||
| 1298 | ||
| 841 | ||
| 604 | ||
| 604 | ||
| 147 | 2.3% | |
| 25 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6409 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1445 | ||
| 1445 | ||
| 1298 | ||
| 841 | ||
| 604 | ||
| 604 | ||
| 147 | 2.3% | |
| 25 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6409 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1445 | ||
| 1445 | ||
| 1298 | ||
| 841 | ||
| 604 | ||
| 604 | ||
| 147 | 2.3% | |
| 25 | 0.4% |
englishGrammar_3
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26059 |
| Missing (%) | 91.5% |
| Memory size | 973.2 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 10 |
| Mean length | 9.7687885 |
| Min length | 6 |
Characters and Unicode
| Total characters | 23787 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2187 | 7.7% | |
| 121 | 0.4% | |
| 115 | 0.4% | |
| 12 | < 0.1% | |
| (Missing) | 26059 | 91.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2199 | ||
| 236 | 9.2% | |
| 127 | 5.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4634 | ||
| 2562 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2199 | ||
| 2199 | ||
| 2199 | ||
| 127 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 23660 | |
| Space Separator | 127 | 0.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4634 | ||
| 2562 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2199 | ||
| 2199 | ||
| 2199 | ||
| 127 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| 127 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 23660 | |
| Common | 127 | 0.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4634 | ||
| 2562 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2199 | ||
| 2199 | ||
| 2199 | ||
| 127 | 0.5% |
Common
| Value | Count | Frequency (%) |
| 127 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 23787 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4634 | ||
| 2562 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2435 | ||
| 2199 | ||
| 2199 | ||
| 2199 | ||
| 127 | 0.5% |
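The Length and Characters tables for englishGrammar_3 can be cross-checked against each other: the mean length should equal the total character count divided by the number of non-missing entries, and the per-category character counts should add up to the total. Below is a minimal Python sketch of that check, using only figures copied from the tables above. The variable names are illustrative, and the tolerance allows for the rounding in the reported mean.

```python
import math

n_rows = 28494          # number of observations in the dataset
n_missing = 26059       # missing entries in englishGrammar_3
total_chars = 23787     # "Total characters"
chars_by_category = {"Lowercase Letter": 23660, "Space Separator": 127}

n_valid = n_rows - n_missing              # non-missing entries
mean_length = total_chars / n_valid       # report shows 9.7687885

# The category counts should partition the total character count.
categories_sum = sum(chars_by_category.values())

print(n_valid, mean_length, categories_sum)
print(math.isclose(mean_length, 9.7687885, rel_tol=1e-7))
print(categories_sum == total_chars)
```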
englishGrammar_4
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26084 |
| Missing (%) | 91.5% |
| Memory size | 993.5 KiB |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 18.728631 |
| Min length | 16 |
Characters and Unicode
| Total characters | 45136 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2109 | 7.4% | |
| 128 | 0.4% | |
| 121 | 0.4% | |
| 52 | 0.2% | |
| (Missing) | 26084 | 91.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2358 | ||
| 2282 | ||
| 2237 | ||
| 249 | 3.5% | |
| 52 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4820 | ||
| 4820 | ||
| 4768 | ||
| 4692 | ||
| 2531 | 5.6% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2289 | 5.1% | |
| Other values (6) | 11576 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 40368 | |
| Space Separator | 4768 | 10.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4820 | ||
| 4820 | ||
| 4692 | ||
| 2531 | 6.3% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2289 | 5.7% | |
| 2289 | 5.7% | |
| Other values (5) | 9287 |
Space Separator
| Value | Count | Frequency (%) |
| 4768 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 40368 | |
| Common | 4768 | 10.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4820 | ||
| 4820 | ||
| 4692 | ||
| 2531 | 6.3% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2410 | 6.0% | |
| 2289 | 5.7% | |
| 2289 | 5.7% | |
| Other values (5) | 9287 |
Common
| Value | Count | Frequency (%) |
| 4768 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 45136 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4820 | ||
| 4820 | ||
| 4768 | ||
| 4692 | ||
| 2531 | 5.6% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2410 | 5.3% | |
| 2289 | 5.1% | |
| Other values (6) | 11576 |
englishGrammar_5
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26666 |
| Missing (%) | 93.6% |
| Memory size | 943.5 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.6438731 |
| Min length | 3 |
Characters and Unicode
| Total characters | 8489 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1238 | 4.3% | |
| 234 | 0.8% | |
| 213 | 0.7% | |
| 143 | 0.5% | |
| (Missing) | 26666 | 93.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1238 | ||
| 234 | 12.8% | |
| 213 | 11.7% | |
| 143 | 7.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1451 | ||
| 1381 | ||
| 1238 | ||
| 1238 | ||
| 824 | ||
| 447 | 5.3% | |
| 447 | 5.3% | |
| 356 | 4.2% | |
| 234 | 2.8% | |
| 234 | 2.8% | |
| Other values (3) | 639 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 8021 | |
| Uppercase Letter | 234 | 2.8% |
| Other Punctuation | 234 | 2.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1451 | ||
| 1381 | ||
| 1238 | ||
| 1238 | ||
| 824 | ||
| 447 | 5.6% | |
| 447 | 5.6% | |
| 356 | 4.4% | |
| 213 | 2.7% | |
| 213 | 2.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 234 |
Other Punctuation
| Value | Count | Frequency (%) |
| 234 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8255 | |
| Common | 234 | 2.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1451 | ||
| 1381 | ||
| 1238 | ||
| 1238 | ||
| 824 | ||
| 447 | 5.4% | |
| 447 | 5.4% | |
| 356 | 4.3% | |
| 234 | 2.8% | |
| 213 | 2.6% | |
| Other values (2) | 426 | 5.2% |
Common
| Value | Count | Frequency (%) |
| 234 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8489 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1451 | ||
| 1381 | ||
| 1238 | ||
| 1238 | ||
| 824 | ||
| 447 | 5.3% | |
| 447 | 5.3% | |
| 356 | 4.2% | |
| 234 | 2.8% | |
| 234 | 2.8% | |
| Other values (3) | 639 |
englishGrammar_6
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26676 |
| Missing (%) | 93.6% |
| Memory size | 945.4 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 6 |
| Mean length | 5.9075908 |
| Min length | 3 |
Characters and Unicode
| Total characters | 10740 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1501 | 5.3% | |
| 139 | 0.5% | |
| 107 | 0.4% | |
| 71 | 0.2% | |
| (Missing) | 26676 | 93.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1501 | ||
| 139 | 7.2% | |
| 107 | 5.6% | |
| 107 | 5.6% | |
| 71 | 3.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1711 | ||
| 1679 | ||
| 1608 | ||
| 1572 | ||
| 1501 | ||
| 1501 | ||
| 495 | 4.6% | |
| 142 | 1.3% | |
| 139 | 1.3% | |
| 107 | 1.0% | |
| Other values (3) | 285 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10633 | |
| Space Separator | 107 | 1.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1711 | ||
| 1679 | ||
| 1608 | ||
| 1572 | ||
| 1501 | ||
| 1501 | ||
| 495 | 4.7% | |
| 142 | 1.3% | |
| 139 | 1.3% | |
| 107 | 1.0% | |
| Other values (2) | 178 | 1.7% |
Space Separator
| Value | Count | Frequency (%) |
| 107 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10633 | |
| Common | 107 | 1.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1711 | ||
| 1679 | ||
| 1608 | ||
| 1572 | ||
| 1501 | ||
| 1501 | ||
| 495 | 4.7% | |
| 142 | 1.3% | |
| 139 | 1.3% | |
| 107 | 1.0% | |
| Other values (2) | 178 | 1.7% |
Common
| Value | Count | Frequency (%) |
| 107 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10740 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1711 | ||
| 1679 | ||
| 1608 | ||
| 1572 | ||
| 1501 | ||
| 1501 | ||
| 495 | 4.6% | |
| 142 | 1.3% | |
| 139 | 1.3% | |
| 107 | 1.0% | |
| Other values (3) | 285 | 2.7% |
englishGrammar_7
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26646 |
| Missing (%) | 93.5% |
| Memory size | 945.1 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 5 |
| Mean length | 5.2245671 |
| Min length | 2 |
Characters and Unicode
| Total characters | 9655 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1097 | 3.8% | |
| 307 | 1.1% | |
| 224 | 0.8% | |
| 220 | 0.8% | |
| (Missing) | 26646 | 93.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1097 | ||
| 307 | 16.6% | |
| 224 | 12.1% | |
| 220 | 11.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3306 | ||
| 1541 | ||
| 1541 | ||
| 1404 | ||
| 531 | 5.5% | |
| 224 | 2.3% | |
| 224 | 2.3% | |
| 224 | 2.3% | |
| 220 | 2.3% | |
| 220 | 2.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9215 | |
| Uppercase Letter | 220 | 2.3% |
| Other Punctuation | 220 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 3306 | ||
| 1541 | ||
| 1541 | ||
| 1404 | ||
| 531 | 5.8% | |
| 224 | 2.4% | |
| 224 | 2.4% | |
| 224 | 2.4% | |
| 220 | 2.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 220 |
Other Punctuation
| Value | Count | Frequency (%) |
| 220 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9435 | |
| Common | 220 | 2.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 3306 | ||
| 1541 | ||
| 1541 | ||
| 1404 | ||
| 531 | 5.6% | |
| 224 | 2.4% | |
| 224 | 2.4% | |
| 224 | 2.4% | |
| 220 | 2.3% | |
| 220 | 2.3% |
Common
| Value | Count | Frequency (%) |
| 220 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9655 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3306 | ||
| 1541 | ||
| 1541 | ||
| 1404 | ||
| 531 | 5.5% | |
| 224 | 2.3% | |
| 224 | 2.3% | |
| 224 | 2.3% | |
| 220 | 2.3% | |
| 220 | 2.3% |
englishGrammar_8
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26600 |
| Missing (%) | 93.4% |
| Memory size | 954.9 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 10 |
| Mean length | 9.7740232 |
| Min length | 3 |
Characters and Unicode
| Total characters | 18512 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1639 | 5.8% | |
| 137 | 0.5% | |
| 62 | 0.2% | |
| 56 | 0.2% | |
| (Missing) | 26600 | 93.4% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1639 | ||
| 137 | 7.2% | |
| 62 | 3.3% | |
| 56 | 3.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3415 | ||
| 1975 | ||
| 1913 | ||
| 1894 | ||
| 1776 | ||
| 1776 | ||
| 1701 | ||
| 1639 | ||
| 1639 | ||
| 274 | 1.5% | |
| Other values (4) | 510 | 2.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 18512 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 3415 | ||
| 1975 | ||
| 1913 | ||
| 1894 | ||
| 1776 | ||
| 1776 | ||
| 1701 | ||
| 1639 | ||
| 1639 | ||
| 274 | 1.5% | |
| Other values (4) | 510 | 2.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 18512 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 3415 | ||
| 1975 | ||
| 1913 | ||
| 1894 | ||
| 1776 | ||
| 1776 | ||
| 1701 | ||
| 1639 | ||
| 1639 | ||
| 274 | 1.5% | |
| Other values (4) | 510 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18512 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3415 | ||
| 1975 | ||
| 1913 | ||
| 1894 | ||
| 1776 | ||
| 1776 | ||
| 1701 | ||
| 1639 | ||
| 1639 | ||
| 274 | 1.5% | |
| Other values (4) | 510 | 2.8% |
englishGrammar_9
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 27001 |
| Missing (%) | 94.8% |
| Memory size | 936.0 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.1828533 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9231 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1068 | 3.7% | |
| 383 | 1.3% | |
| 26 | 0.1% | |
| 16 | 0.1% | |
| (Missing) | 27001 | 94.8% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1068 | ||
| 383 | 25.7% | |
| 26 | 1.7% | |
| 16 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1451 | ||
| 1451 | ||
| 1451 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 798 | ||
| 399 | 4.3% | |
| 399 | 4.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9231 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 1451 | ||
| 1451 | ||
| 1451 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 798 | ||
| 399 | 4.3% | |
| 399 | 4.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9231 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 1451 | ||
| 1451 | ||
| 1451 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 798 | ||
| 399 | 4.3% | |
| 399 | 4.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9231 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1451 | ||
| 1451 | ||
| 1451 | ||
| 1094 | ||
| 1094 | ||
| 1094 | ||
| 798 | ||
| 399 | 4.3% | |
| 399 | 4.3% |
final.accuracy
Real number (ℝ)
| Distinct | 742 |
|---|---|
| Distinct (%) | 24.4% |
| Missing | 25449 |
| Missing (%) | 89.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 92.063202 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 26 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 222.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 26.592 |
| Q1 | 98.14 |
| median | 99.67 |
| Q3 | 100 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 1.86 |
Descriptive statistics
| Standard deviation | 21.422897 |
|---|---|
| Coefficient of variation (CV) | 0.23269772 |
| Kurtosis | 8.6335598 |
| Mean | 92.063202 |
| Median Absolute Deviation (MAD) | 0.33 |
| Skewness | -3.1187167 |
| Sum | 280332.45 |
| Variance | 458.94052 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 100 | 1494 | 5.2% |
| 0 | 26 | 0.1% |
| 99.44 | 19 | 0.1% |
| 99.4 | 18 | 0.1% |
| 99.53 | 18 | 0.1% |
| 99.29 | 17 | 0.1% |
| 99.35 | 16 | 0.1% |
| 99.39 | 16 | 0.1% |
| 99.45 | 14 | < 0.1% |
| 99.52 | 14 | < 0.1% |
| Other values (732) | 1393 | 4.9% |
| (Missing) | 25449 | 89.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 26 | 0.1% |
| 1.01 | 1 | < 0.1% |
| 1.56 | 1 | < 0.1% |
| 3.13 | 1 | < 0.1% |
| 3.23 | 1 | < 0.1% |
| 3.25 | 1 | < 0.1% |
| 3.47 | 1 | < 0.1% |
| 4.29 | 1 | < 0.1% |
| 4.85 | 1 | < 0.1% |
| 4.92 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 100 | 1494 | 5.2% |
| 99.79 | 2 | < 0.1% |
| 99.77 | 1 | < 0.1% |
| 99.73 | 5 | < 0.1% |
| 99.72 | 4 | < 0.1% |
| 99.71 | 1 | < 0.1% |
| 99.7 | 1 | < 0.1% |
| 99.69 | 6 | < 0.1% |
| 99.68 | 4 | < 0.1% |
| 99.67 | 6 | < 0.1% |
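Several of the descriptive statistics reported for final.accuracy can be derived from one another, which gives a quick consistency check on the tables above. The Python sketch below recomputes the non-missing count, coefficient of variation, variance, sum, interquartile range, and range from the reported mean, standard deviation, and quantiles. The function name `derived_stats` is hypothetical and is not part of the tool that generated this report. The inputs are rounded report values, so results should match the reported figures only to within rounding.

```python
import math


def derived_stats(n_rows, n_missing, mean, std, q1, q3, minimum, maximum):
    """Recompute summary statistics that follow from other reported figures."""
    n_valid = n_rows - n_missing
    return {
        "n_valid": n_valid,            # non-missing observations
        "cv": std / mean,              # coefficient of variation
        "variance": std ** 2,          # variance is the squared std deviation
        "sum": mean * n_valid,         # sum = mean * number of valid values
        "iqr": q3 - q1,                # interquartile range
        "range": maximum - minimum,
    }


# Figures copied from the final.accuracy tables in this report.
stats = derived_stats(
    n_rows=28494, n_missing=25449,
    mean=92.063202, std=21.422897,
    q1=98.14, q3=100.0,
    minimum=0.0, maximum=100.0,
)

for name, value in stats.items():
    print(f"{name}: {value:.8g}")

# Compare with the reported CV, allowing for rounding of the inputs.
print(math.isclose(stats["cv"], 0.23269772, rel_tol=1e-6))
```

The same function can be applied to final.wpm by passing that variable's mean, standard deviation, and quantiles.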
final.wpm
Real number (ℝ)
| Distinct | 360 |
|---|---|
| Distinct (%) | 11.8% |
| Missing | 25449 |
| Missing (%) | 89.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.777603 |
| Minimum | 0 |
|---|---|
| Maximum | 143.4 |
| Zeros | 24 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 222.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 15.8 |
| Q1 | 27.6 |
| median | 34.6 |
| Q3 | 42.6 |
| 95-th percentile | 58.6 |
| Maximum | 143.4 |
| Range | 143.4 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 14.576262 |
|---|---|
| Coefficient of variation (CV) | 0.40741305 |
| Kurtosis | 6.6699444 |
| Mean | 35.777603 |
| Median Absolute Deviation (MAD) | 7.6 |
| Skewness | 1.3918735 |
| Sum | 108942.8 |
| Variance | 212.46742 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 33.4 | 52 | 0.2% |
| 35.6 | 45 | 0.2% |
| 36.4 | 38 | 0.1% |
| 33.2 | 36 | 0.1% |
| 33 | 33 | 0.1% |
| 32.6 | 31 | 0.1% |
| 34.6 | 30 | 0.1% |
| 33.6 | 30 | 0.1% |
| 28.2 | 30 | 0.1% |
| 30.2 | 30 | 0.1% |
| Other values (350) | 2690 | 9.4% |
| (Missing) | 25449 | 89.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 24 | 0.1% |
| 0.6 | 1 | < 0.1% |
| 1.2 | 2 | < 0.1% |
| 1.4 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3.2 | 2 | < 0.1% |
| 3.4 | 1 | < 0.1% |
| 3.6 | 4 | < 0.1% |
| 4.4 | 4 | < 0.1% |
| 4.6 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 143.4 | 1 | < 0.1% |
| 137 | 1 | < 0.1% |
| 126.4 | 1 | < 0.1% |
| 126.2 | 3 | < 0.1% |
| 123.8 | 1 | < 0.1% |
| 122.8 | 1 | < 0.1% |
| 118 | 2 | < 0.1% |
| 117.8 | 1 | < 0.1% |
| 117.6 | 1 | < 0.1% |
| 114.6 | 1 | < 0.1% |
listening_1
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25236 |
| Missing (%) | 88.6% |
| Memory size | 1.0 MiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 26.845918 |
| Min length | 22 |
Characters and Unicode
| Total characters | 87464 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2753 | 9.7% | |
| 235 | 0.8% | |
| 164 | 0.6% | |
| 106 | 0.4% | |
| (Missing) | 25236 | 88.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2917 | ||
| 2917 | ||
| 2753 | ||
| 2753 | ||
| 341 | 2.5% | |
| 235 | 1.7% | |
| 235 | 1.7% | |
| 235 | 1.7% | |
| 235 | 1.7% | |
| 235 | 1.7% | |
| Other values (6) | 694 | 5.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 11845 | ||
| 10292 | ||
| 6622 | 7.6% | |
| 6587 | 7.5% | |
| 6481 | 7.4% | |
| 6329 | 7.2% | |
| 3622 | 4.1% | |
| 3599 | 4.1% | |
| 3258 | 3.7% | |
| 3152 | 3.6% | |
| Other values (15) | 25677 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 67739 | |
| Space Separator | 10292 | 11.8% |
| Other Punctuation | 6175 | 7.1% |
| Uppercase Letter | 3258 | 3.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 11845 | ||
| 6622 | ||
| 6587 | ||
| 6481 | ||
| 6329 | ||
| 3622 | 5.3% | |
| 3599 | 5.3% | |
| 3152 | 4.7% | |
| 3129 | 4.6% | |
| 3023 | 4.5% | |
| Other values (10) | 13350 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3258 | ||
| 2917 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3023 | ||
| 235 | 7.2% |
Space Separator
| Value | Count | Frequency (%) |
| 10292 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 70997 | |
| Common | 16467 | 18.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 11845 | ||
| 6622 | 9.3% | |
| 6587 | 9.3% | |
| 6481 | 9.1% | |
| 6329 | 8.9% | |
| 3622 | 5.1% | |
| 3599 | 5.1% | |
| 3152 | 4.4% | |
| 3129 | 4.4% | |
| 3023 | 4.3% | |
| Other values (12) | 16608 |
Common
| Value | Count | Frequency (%) |
| 10292 | ||
| 3258 | 19.8% | |
| 2917 | 17.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 87464 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 11845 | ||
| 10292 | ||
| 6622 | 7.6% | |
| 6587 | 7.5% | |
| 6481 | 7.4% | |
| 6329 | 7.2% | |
| 3622 | 4.1% | |
| 3599 | 4.1% | |
| 3258 | 3.7% | |
| 3152 | 3.6% | |
| Other values (15) | 25677 |
listening_10
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25222 |
| Missing (%) | 88.5% |
| Memory size | 1.2 MiB |
Length
| Max length | 97 |
|---|---|
| Median length | 55 |
| Mean length | 66.081296 |
| Min length | 55 |
Characters and Unicode
| Total characters | 216218 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1877 | 6.6% | |
| 652 | 2.3% | |
| 382 | 1.3% | |
| 361 | 1.3% | |
| (Missing) | 25222 | 88.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3272 | 10.6% | |
| 1877 | 6.1% | |
| 1877 | 6.1% | |
| 1877 | 6.1% | |
| 1877 | 6.1% | |
| 1877 | 6.1% | |
| 1877 | 6.1% | |
| 1034 | 3.4% | |
| 1013 | 3.3% | |
| 652 | 2.1% | |
| Other values (28) | 13498 |
Most occurring characters
| Value | Count | Frequency (%) |
| 27459 | 12.7% | |
| 20142 | 9.3% | |
| 14574 | 6.7% | |
| 14492 | 6.7% | |
| 13067 | 6.0% | |
| 11963 | 5.5% | |
| 11481 | 5.3% | |
| 11332 | 5.2% | |
| 10738 | 5.0% | |
| 8479 | 3.9% | |
| Other values (26) | 72491 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 170301 | |
| Space Separator | 27459 | 12.7% |
| Uppercase Letter | 7678 | 3.6% |
| Decimal Number | 7508 | 3.5% |
| Other Punctuation | 3272 | 1.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 20142 | ||
| 14574 | 8.6% | |
| 14492 | 8.5% | |
| 13067 | 7.7% | |
| 11963 | 7.0% | |
| 11481 | 6.7% | |
| 11332 | 6.7% | |
| 10738 | 6.3% | |
| 8479 | 5.0% | |
| 7515 | 4.4% | |
| Other values (14) | 46518 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1877 | ||
| 1877 | ||
| 1877 | ||
| 743 | 9.7% | |
| 652 | 8.5% | |
| 652 | 8.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1877 | ||
| 1877 | ||
| 1877 | ||
| 1877 |
Space Separator
| Value | Count | Frequency (%) |
| 27459 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3272 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 177979 | |
| Common | 38239 | 17.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 20142 | 11.3% | |
| 14574 | 8.2% | |
| 14492 | 8.1% | |
| 13067 | 7.3% | |
| 11963 | 6.7% | |
| 11481 | 6.5% | |
| 11332 | 6.4% | |
| 10738 | 6.0% | |
| 8479 | 4.8% | |
| 7515 | 4.2% | |
| Other values (20) | 54196 |
Common
| Value | Count | Frequency (%) |
| 27459 | ||
| 3272 | 8.6% | |
| 1877 | 4.9% | |
| 1877 | 4.9% | |
| 1877 | 4.9% | |
| 1877 | 4.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 216218 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 27459 | 12.7% | |
| 20142 | 9.3% | |
| 14574 | 6.7% | |
| 14492 | 6.7% | |
| 13067 | 6.0% | |
| 11963 | 5.5% | |
| 11481 | 5.3% | |
| 11332 | 5.2% | |
| 10738 | 5.0% | |
| 8479 | 3.9% | |
| Other values (26) | 72491 |
listening_2
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25222 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 40 |
|---|---|
| Median length | 36 |
| Mean length | 33.526589 |
| Min length | 23 |
Characters and Unicode
| Total characters | 109699 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2159 | 7.6% | |
| 596 | 2.1% | |
| 340 | 1.2% | |
| 177 | 0.6% | |
| (Missing) | 25222 | 88.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2755 | ||
| 2336 | ||
| 2159 | ||
| 2159 | ||
| 2159 | ||
| 2159 | ||
| 2159 | ||
| 773 | 3.7% | |
| 596 | 2.8% | |
| 596 | 2.8% | |
| Other values (10) | 3167 |
Most occurring characters
| Value | Count | Frequency (%) |
| 17746 | ||
| 12236 | ||
| 9912 | 9.0% | |
| 8526 | 7.8% | |
| 7382 | 6.7% | |
| 6994 | 6.4% | |
| 5855 | 5.3% | |
| 5785 | 5.3% | |
| 4385 | 4.0% | |
| 3775 | 3.4% | |
| Other values (16) | 27103 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 85409 | |
| Space Separator | 17746 | 16.2% |
| Other Punctuation | 3272 | 3.0% |
| Uppercase Letter | 3272 | 3.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 12236 | ||
| 9912 | ||
| 8526 | ||
| 7382 | ||
| 6994 | ||
| 5855 | 6.9% | |
| 5785 | 6.8% | |
| 4385 | 5.1% | |
| 3775 | 4.4% | |
| 3095 | 3.6% | |
| Other values (10) | 17464 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2159 | ||
| 773 | 23.6% | |
| 340 | 10.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| 2499 | ||
| 773 | 23.6% |
Space Separator
| Value | Count | Frequency (%) |
| 17746 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 88681 | |
| Common | 21018 | 19.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 12236 | ||
| 9912 | ||
| 8526 | ||
| 7382 | 8.3% | |
| 6994 | 7.9% | |
| 5855 | 6.6% | |
| 5785 | 6.5% | |
| 4385 | 4.9% | |
| 3775 | 4.3% | |
| 3095 | 3.5% | |
| Other values (13) | 20736 |
Common
| Value | Count | Frequency (%) |
| 17746 | ||
| 2499 | 11.9% | |
| 773 | 3.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 109699 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 17746 | ||
| 12236 | ||
| 9912 | 9.0% | |
| 8526 | 7.8% | |
| 7382 | 6.7% | |
| 6994 | 6.4% | |
| 5855 | 5.3% | |
| 5785 | 5.3% | |
| 4385 | 4.0% | |
| 3775 | 3.4% | |
| Other values (16) | 27103 |
listening_3
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25240 |
| Missing (%) | 88.6% |
| Memory size | 1.1 MiB |
Length
| Max length | 54 |
|---|---|
| Median length | 39 |
| Mean length | 40.238476 |
| Min length | 34 |
Characters and Unicode
| Total characters | 130936 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2872 | 10.1% | |
| 268 | 0.9% | |
| 58 | 0.2% | |
| 56 | 0.2% | |
| (Missing) | 25240 | 88.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2872 | ||
| 2872 | ||
| 2872 | ||
| 2872 | ||
| 2872 | ||
| 2872 | ||
| 498 | 2.4% | |
| 268 | 1.3% | |
| 268 | 1.3% | |
| 268 | 1.3% | |
| Other values (16) | 2178 |
Most occurring characters
| Value | Count | Frequency (%) |
| 20272 | ||
| 17726 | ||
| 11984 | 9.2% | |
| 9878 | 7.5% | |
| 9438 | 7.2% | |
| 9436 | 7.2% | |
| 6452 | 4.9% | |
| 6070 | 4.6% | |
| 4326 | 3.3% | |
| 4174 | 3.2% | |
| Other values (17) | 31180 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 103830 | |
| Space Separator | 17726 | 13.5% |
| Other Punctuation | 6126 | 4.7% |
| Uppercase Letter | 3254 | 2.5% |
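The category names in the table above match Unicode general categories. As a minimal sketch of how such a breakdown could be reproduced for a text column, the following uses Python's standard `unicodedata` module. The `category_counts` helper, the `CATEGORY_NAMES` mapping, and the sample strings are illustrative assumptions, not taken from the dataset or from the tool that produced this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
# Any other code is reported under its two-letter abbreviation.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for text in values:
        if text is None:  # missing cells contribute no characters
            continue
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Hypothetical sample values; the real column contents are not shown in the report.
sample = ["Hello, world.", None, "It is a well-known fact"]
counts = category_counts(sample)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name}: {n} ({n / total:.1%})")
```

Applied to the non-missing values of a column, the per-category totals should sum to that column's "Total characters" figure.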
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 20272 | ||
| 11984 | ||
| 9878 | ||
| 9438 | ||
| 9436 | ||
| 6452 | 6.2% | |
| 6070 | 5.8% | |
| 4326 | 4.2% | |
| 4174 | 4.0% | |
| 3752 | 3.6% | |
| Other values (11) | 18048 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2872 | ||
| 326 | 10.0% | |
| 56 | 1.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| 3254 | ||
| 2872 |
Space Separator
| Value | Count | Frequency (%) |
| 17726 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 107084 | |
| Common | 23852 | 18.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 20272 | ||
| 11984 | ||
| 9878 | ||
| 9438 | ||
| 9436 | ||
| 6452 | 6.0% | |
| 6070 | 5.7% | |
| 4326 | 4.0% | |
| 4174 | 3.9% | |
| 3752 | 3.5% | |
| Other values (14) | 21302 |
Common
| Value | Count | Frequency (%) |
| 17726 | ||
| 3254 | 13.6% | |
| 2872 | 12.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 130936 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 20272 | ||
| 17726 | ||
| 11984 | 9.2% | |
| 9878 | 7.5% | |
| 9438 | 7.2% | |
| 9436 | 7.2% | |
| 6452 | 4.9% | |
| 6070 | 4.6% | |
| 4326 | 3.3% | |
| 4174 | 3.2% | |
| Other values (17) | 31180 |
listening_4
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25219 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 52 |
|---|---|
| Median length | 47 |
| Mean length | 46.412519 |
| Min length | 39 |
Characters and Unicode
| Total characters | 152001 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2719 | 9.5% | |
| 262 | 0.9% | |
| 174 | 0.6% | |
| 120 | 0.4% | |
| (Missing) | 25219 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3275 | ||
| 2839 | ||
| 2719 | ||
| 2719 | ||
| 2719 | ||
| 2719 | ||
| 2719 | ||
| 2719 | ||
| 436 | 1.7% | |
| 262 | 1.0% | |
| Other values (15) | 2746 |
Most occurring characters
| Value | Count | Frequency (%) |
| 22597 | ||
| 18766 | ||
| 18386 | ||
| 9737 | 6.4% | |
| 9497 | 6.2% | |
| 9267 | 6.1% | |
| 8887 | 5.8% | |
| 8691 | 5.7% | |
| 6866 | 4.5% | |
| 6288 | 4.1% | |
| Other values (14) | 33019 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 122854 | |
| Space Separator | 22597 | 14.9% |
| Other Punctuation | 3275 | 2.2% |
| Uppercase Letter | 3275 | 2.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 18766 | ||
| 18386 | ||
| 9737 | ||
| 9497 | ||
| 9267 | ||
| 8887 | 7.2% | |
| 8691 | 7.1% | |
| 6866 | 5.6% | |
| 6288 | 5.1% | |
| 4093 | 3.3% | |
| Other values (11) | 22376 |
Space Separator
| Value | Count | Frequency (%) |
| 22597 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3275 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3275 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 126129 | |
| Common | 25872 | 17.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 18766 | ||
| 18386 | ||
| 9737 | 7.7% | |
| 9497 | 7.5% | |
| 9267 | 7.3% | |
| 8887 | 7.0% | |
| 8691 | 6.9% | |
| 6866 | 5.4% | |
| 6288 | 5.0% | |
| 4093 | 3.2% | |
| Other values (12) | 25651 |
Common
| Value | Count | Frequency (%) |
| 22597 | ||
| 3275 | 12.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 152001 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 22597 | ||
| 18766 | ||
| 18386 | ||
| 9737 | 6.4% | |
| 9497 | 6.2% | |
| 9267 | 6.1% | |
| 8887 | 5.8% | |
| 8691 | 5.7% | |
| 6866 | 4.5% | |
| 6288 | 4.1% | |
| Other values (14) | 33019 |
listening_5
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25238 |
| Missing (%) | 88.6% |
| Memory size | 1.1 MiB |
Length
| Max length | 44 |
|---|---|
| Median length | 34 |
| Mean length | 36.469287 |
| Min length | 30 |
Characters and Unicode
| Total characters | 118744 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2137 | 7.5% | |
| 786 | 2.8% | |
| 216 | 0.8% | |
| 117 | 0.4% | |
| (Missing) | 25238 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2137 | ||
| 2137 | ||
| 2137 | ||
| 2137 | ||
| 2137 | ||
| 1002 | 5.0% | |
| 1002 | 5.0% | |
| 1002 | 5.0% | |
| 786 | 4.0% | |
| 786 | 4.0% | |
| Other values (13) | 4593 |
Most occurring characters
| Value | Count | Frequency (%) |
| 16600 | ||
| 12678 | 10.7% | |
| 11572 | 9.7% | |
| 9518 | 8.0% | |
| 6197 | 5.2% | |
| 5963 | 5.0% | |
| 5492 | 4.6% | |
| 5273 | 4.4% | |
| 5060 | 4.3% | |
| 4708 | 4.0% | |
| Other values (18) | 35683 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 92259 | |
| Space Separator | 16600 | 14.0% |
| Other Punctuation | 6629 | 5.6% |
| Uppercase Letter | 3256 | 2.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 12678 | ||
| 11572 | ||
| 9518 | ||
| 6197 | 6.7% | |
| 5963 | 6.5% | |
| 5492 | 6.0% | |
| 5273 | 5.7% | |
| 5060 | 5.5% | |
| 4708 | 5.1% | |
| 4706 | 5.1% | |
| Other values (10) | 21092 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2137 | ||
| 786 | 24.1% | |
| 216 | 6.6% | |
| 117 | 3.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| 3256 | ||
| 2254 | ||
| 1119 | 16.9% |
Space Separator
| Value | Count | Frequency (%) |
| 16600 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 95515 | |
| Common | 23229 | 19.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 12678 | ||
| 11572 | ||
| 9518 | 10.0% | |
| 6197 | 6.5% | |
| 5963 | 6.2% | |
| 5492 | 5.7% | |
| 5273 | 5.5% | |
| 5060 | 5.3% | |
| 4708 | 4.9% | |
| 4706 | 4.9% | |
| Other values (14) | 24348 |
Common
| Value | Count | Frequency (%) |
| 16600 | ||
| 3256 | 14.0% | |
| 2254 | 9.7% | |
| 1119 | 4.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 118744 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 16600 | ||
| 12678 | 10.7% | |
| 11572 | 9.7% | |
| 9518 | 8.0% | |
| 6197 | 5.2% | |
| 5963 | 5.0% | |
| 5492 | 4.6% | |
| 5273 | 4.4% | |
| 5060 | 4.3% | |
| 4708 | 4.0% | |
| Other values (18) | 35683 |
listening_6
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25215 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 68 |
|---|---|
| Median length | 66 |
| Mean length | 63.130528 |
| Min length | 57 |
Characters and Unicode
| Total characters | 207005 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1727 | 6.1% | |
| 928 | 3.3% | |
| 547 | 1.9% | |
| 77 | 0.3% | |
| (Missing) | 25215 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3202 | 9.3% | |
| 3202 | 9.3% | |
| 2655 | 7.7% | |
| 1856 | 5.4% | |
| 1727 | 5.0% | |
| 1727 | 5.0% | |
| 1727 | 5.0% | |
| 1727 | 5.0% | |
| 1727 | 5.0% | |
| 1727 | 5.0% | |
| Other values (28) | 13142 |
Most occurring characters
| Value | Count | Frequency (%) |
| 31140 | ||
| 23457 | ||
| 23091 | ||
| 15578 | 7.5% | |
| 13518 | 6.5% | |
| 11641 | 5.6% | |
| 10307 | 5.0% | |
| 8171 | 3.9% | |
| 7806 | 3.8% | |
| 7130 | 3.4% | |
| Other values (20) | 55166 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 165011 | |
| Space Separator | 31140 | 15.0% |
| Other Punctuation | 5553 | 2.7% |
| Uppercase Letter | 5301 | 2.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 23457 | ||
| 23091 | ||
| 15578 | ||
| 13518 | 8.2% | |
| 11641 | 7.1% | |
| 10307 | 6.2% | |
| 8171 | 5.0% | |
| 7806 | 4.7% | |
| 7130 | 4.3% | |
| 6088 | 3.7% | |
| Other values (13) | 38224 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3660 | ||
| 1094 | 20.6% | |
| 547 | 10.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| 3279 | ||
| 1727 | ||
| 547 | 9.9% |
Space Separator
| Value | Count | Frequency (%) |
| 31140 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 170312 | |
| Common | 36693 | 17.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 23457 | ||
| 23091 | ||
| 15578 | 9.1% | |
| 13518 | 7.9% | |
| 11641 | 6.8% | |
| 10307 | 6.1% | |
| 8171 | 4.8% | |
| 7806 | 4.6% | |
| 7130 | 4.2% | |
| 6088 | 3.6% | |
| Other values (16) | 43525 |
Common
| Value | Count | Frequency (%) |
| 31140 | ||
| 3279 | 8.9% | |
| 1727 | 4.7% | |
| 547 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 207005 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 31140 | ||
| 23457 | ||
| 23091 | ||
| 15578 | 7.5% | |
| 13518 | 6.5% | |
| 11641 | 5.6% | |
| 10307 | 5.0% | |
| 8171 | 3.9% | |
| 7806 | 3.8% | |
| 7130 | 3.4% | |
| Other values (20) | 55166 |
listening_7
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25210 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 49 |
|---|---|
| Median length | 39 |
| Mean length | 35.777101 |
| Min length | 23 |
Characters and Unicode
| Total characters | 117492 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1298 | 4.6% | |
| 992 | 3.5% | |
| 957 | 3.4% | |
| 37 | 0.1% | |
| (Missing) | 25210 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2290 | ||
| 2255 | ||
| 1986 | 9.9% | |
| 1298 | 6.5% | |
| 1298 | 6.5% | |
| 1298 | 6.5% | |
| 1298 | 6.5% | |
| 1298 | 6.5% | |
| 1029 | 5.1% | |
| 992 | 4.9% | |
| Other values (9) | 5038 |
Most occurring characters
| Value | Count | Frequency (%) |
| 16796 | ||
| 10536 | 9.0% | |
| 9509 | 8.1% | |
| 7287 | 6.2% | |
| 7254 | 6.2% | |
| 6749 | 5.7% | |
| 6570 | 5.6% | |
| 6568 | 5.6% | |
| 5344 | 4.5% | |
| 5303 | 4.5% | |
| Other values (17) | 35576 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 94128 | |
| Space Separator | 16796 | 14.3% |
| Other Punctuation | 3284 | 2.8% |
| Uppercase Letter | 3284 | 2.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 10536 | ||
| 9509 | 10.1% | |
| 7287 | 7.7% | |
| 7254 | 7.7% | |
| 6749 | 7.2% | |
| 6570 | 7.0% | |
| 6568 | 7.0% | |
| 5344 | 5.7% | |
| 5303 | 5.6% | |
| 5233 | 5.6% | |
| Other values (12) | 23775 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2255 | ||
| 992 | ||
| 37 | 1.1% |
Space Separator
| Value | Count | Frequency (%) |
| 16796 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3284 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 97412 | |
| Common | 20080 | 17.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 10536 | 10.8% | |
| 9509 | 9.8% | |
| 7287 | 7.5% | |
| 7254 | 7.4% | |
| 6749 | 6.9% | |
| 6570 | 6.7% | |
| 6568 | 6.7% | |
| 5344 | 5.5% | |
| 5303 | 5.4% | |
| 5233 | 5.4% | |
| Other values (15) | 27059 |
Common
| Value | Count | Frequency (%) |
| 16796 | ||
| 3284 | 16.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 117492 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 16796 | ||
| 10536 | 9.0% | |
| 9509 | 8.1% | |
| 7287 | 6.2% | |
| 7254 | 6.2% | |
| 6749 | 5.7% | |
| 6570 | 5.6% | |
| 6568 | 5.6% | |
| 5344 | 4.5% | |
| 5303 | 4.5% | |
| Other values (17) | 35576 |
listening_8
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25225 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 54 |
|---|---|
| Median length | 40 |
| Mean length | 41.448455 |
| Min length | 40 |
Characters and Unicode
| Total characters | 135495 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1795 | 6.3% | |
| 1338 | 4.7% | |
| 91 | 0.3% | |
| 45 | 0.2% | |
| (Missing) | 25225 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1931 | 7.9% | |
| 1886 | 7.7% | |
| 1886 | 7.7% | |
| 1795 | 7.3% | |
| 1795 | 7.3% | |
| 1795 | 7.3% | |
| 1795 | 7.3% | |
| 1383 | 5.6% | |
| 1338 | 5.4% | |
| 1338 | 5.4% | |
| Other values (20) | 7641 |
Most occurring characters
| Value | Count | Frequency (%) |
| 21314 | ||
| 13594 | 10.0% | |
| 10812 | 8.0% | |
| 8650 | 6.4% | |
| 8192 | 6.0% | |
| 7831 | 5.8% | |
| 7785 | 5.7% | |
| 5990 | 4.4% | |
| 5567 | 4.1% | |
| 4744 | 3.5% | |
| Other values (23) | 41016 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 101061 | |
| Space Separator | 21314 | 15.7% |
| Other Punctuation | 6582 | 4.9% |
| Uppercase Letter | 5200 | 3.8% |
| Dash Punctuation | 1338 | 1.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 13594 | ||
| 10812 | ||
| 8650 | 8.6% | |
| 8192 | 8.1% | |
| 7831 | 7.7% | |
| 7785 | 7.7% | |
| 5990 | 5.9% | |
| 5567 | 5.5% | |
| 4744 | 4.7% | |
| 4607 | 4.6% | |
| Other values (13) | 23289 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3224 | ||
| 1840 | ||
| 1473 | ||
| 45 | 0.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1976 | ||
| 1795 | ||
| 1338 | ||
| 91 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| 21314 |
Dash Punctuation
| Value | Count | Frequency (%) |
| 1338 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 106261 | |
| Common | 29234 | 21.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 13594 | ||
| 10812 | 10.2% | |
| 8650 | 8.1% | |
| 8192 | 7.7% | |
| 7831 | 7.4% | |
| 7785 | 7.3% | |
| 5990 | 5.6% | |
| 5567 | 5.2% | |
| 4744 | 4.5% | |
| 4607 | 4.3% | |
| Other values (17) | 28489 |
Common
| Value | Count | Frequency (%) |
| 21314 | ||
| 3224 | 11.0% | |
| 1840 | 6.3% | |
| 1473 | 5.0% | |
| 1338 | 4.6% | |
| 45 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 135495 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 21314 | ||
| 13594 | 10.0% | |
| 10812 | 8.0% | |
| 8650 | 6.4% | |
| 8192 | 6.0% | |
| 7831 | 5.8% | |
| 7785 | 5.7% | |
| 5990 | 4.4% | |
| 5567 | 4.1% | |
| 4744 | 3.5% | |
| Other values (23) | 41016 |
listening_9
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25213 |
| Missing (%) | 88.5% |
| Memory size | 1.2 MiB |
Length
| Max length | 75 |
|---|---|
| Median length | 75 |
| Mean length | 73.511429 |
| Min length | 47 |
Characters and Unicode
| Total characters | 241191 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2623 | 9.2% | |
| 584 | 2.0% | |
| 38 | 0.1% | |
| 36 | 0.1% | |
| (Missing) | 25213 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3827 | 8.5% | |
| 3281 | 7.3% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| 2623 | 5.8% | |
| Other values (26) | 16850 |
Most occurring characters
| Value | Count | Frequency (%) |
| 41661 | ||
| 26064 | ||
| 25433 | ||
| 17674 | 7.3% | |
| 16949 | 7.0% | |
| 16185 | 6.7% | |
| 13634 | 5.7% | |
| 10564 | 4.4% | |
| 9695 | 4.0% | |
| 9149 | 3.8% | |
| Other values (21) | 54183 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 192272 | |
| Space Separator | 41661 | 17.3% |
| Other Punctuation | 3941 | 1.6% |
| Uppercase Letter | 3317 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 26064 | ||
| 25433 | ||
| 17674 | ||
| 16949 | ||
| 16185 | ||
| 13634 | 7.1% | |
| 10564 | 5.5% | |
| 9695 | 5.0% | |
| 9149 | 4.8% | |
| 7108 | 3.7% | |
| Other values (13) | 39817 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2623 | ||
| 584 | 17.6% | |
| 74 | 2.2% | |
| 36 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 3281 | ||
| 622 | 15.8% | |
| 38 | 1.0% |
Space Separator
| Value | Count | Frequency (%) |
| 41661 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 195589 | |
| Common | 45602 | 18.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 26064 | ||
| 25433 | ||
| 17674 | ||
| 16949 | 8.7% | |
| 16185 | 8.3% | |
| 13634 | 7.0% | |
| 10564 | 5.4% | |
| 9695 | 5.0% | |
| 9149 | 4.7% | |
| 7108 | 3.6% | |
| Other values (17) | 43134 |
Common
| Value | Count | Frequency (%) |
| 41661 | ||
| 3281 | 7.2% | |
| 622 | 1.4% | |
| 38 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 241191 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 41661 | ||
| 26064 | ||
| 25433 | ||
| 17674 | 7.3% | |
| 16949 | 7.0% | |
| 16185 | 6.7% | |
| 13634 | 5.7% | |
| 10564 | 4.4% | |
| 9695 | 4.0% | |
| 9149 | 3.8% | |
| Other values (21) | 54183 |
percent
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 11232 |
|---|---|
| Missing (%) | 39.4% |
| Memory size | 1.0 MiB |
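The "Missing (%)" figures in these summaries can be cross-checked against the missing counts and the 28494 observations reported in the dataset statistics. The following is a minimal sketch of that check; the `missing_pct` helper and the `REPORTED` dictionary are names introduced here for illustration, and rounding to one decimal place is an assumption about how the report formats percentages.

```python
N_OBSERVATIONS = 28494  # "Number of observations" from the dataset statistics

# (missing count, reported Missing (%)) copied from the variable summaries.
REPORTED = {
    "listening_3": (25240, 88.6),
    "percent": (11232, 39.4),
    "readingComprehension_1": (25001, 87.7),
    "readingComprehension_11": (25377, 89.1),
}

def missing_pct(missing: int, n_rows: int = N_OBSERVATIONS) -> float:
    """Share of missing cells as a percentage, rounded to one decimal place."""
    return round(100.0 * missing / n_rows, 1)

for name, (missing, reported) in REPORTED.items():
    computed = missing_pct(missing)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{name}: computed {computed}% vs reported {reported}% [{status}]")
```

The same pattern extends to any other variable in the report by adding its missing count and reported percentage to the dictionary.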
readingComprehension_1
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25001 |
| Missing (%) | 87.7% |
| Memory size | 1005.5 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 8.7054108 |
| Min length | 6 |
Characters and Unicode
| Total characters | 30408 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1919 | 6.7% | |
| 1267 | 4.4% | |
| 299 | 1.0% | |
| 8 | < 0.1% | |
| (Missing) | 25001 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1919 | ||
| 1267 | ||
| 1267 | ||
| 1267 | ||
| 299 | 4.7% | |
| 299 | 4.7% | |
| 8 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4453 | ||
| 3838 | ||
| 3485 | ||
| 3431 | ||
| 3186 | ||
| 2833 | ||
| 2534 | ||
| 2234 | ||
| 1919 | ||
| 1275 | 4.2% | |
| Other values (5) | 1220 | 4.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 27575 | |
| Space Separator | 2833 | 9.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4453 | ||
| 3838 | ||
| 3485 | ||
| 3431 | ||
| 3186 | ||
| 2534 | ||
| 2234 | ||
| 1919 | ||
| 1275 | 4.6% | |
| 598 | 2.2% | |
| Other values (4) | 622 | 2.3% |
Space Separator
| Value | Count | Frequency (%) |
| 2833 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 27575 | |
| Common | 2833 | 9.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4453 | ||
| 3838 | ||
| 3485 | ||
| 3431 | ||
| 3186 | ||
| 2534 | ||
| 2234 | ||
| 1919 | ||
| 1275 | 4.6% | |
| 598 | 2.2% | |
| Other values (4) | 622 | 2.3% |
Common
| Value | Count | Frequency (%) |
| 2833 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 30408 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4453 | ||
| 3838 | ||
| 3485 | ||
| 3431 | ||
| 3186 | ||
| 2833 | ||
| 2534 | ||
| 2234 | ||
| 1919 | ||
| 1275 | 4.2% | |
| Other values (5) | 1220 | 4.0% |
readingComprehension_10
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25140 |
| Missing (%) | 88.2% |
| Memory size | 1.1 MiB |
Length
| Max length | 64 |
|---|---|
| Median length | 64 |
| Mean length | 59.881038 |
| Min length | 49 |
Characters and Unicode
| Total characters | 200841 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2011 | 7.1% | |
| 807 | 2.8% | |
| 281 | 1.0% | |
| 255 | 0.9% | |
| (Missing) | 25140 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3073 | 9.5% | |
| 2818 | 8.7% | |
| 2547 | 7.9% | |
| 2292 | 7.1% | |
| 2011 | 6.2% | |
| 2011 | 6.2% | |
| 2011 | 6.2% | |
| 2011 | 6.2% | |
| 2011 | 6.2% | |
| 2011 | 6.2% | |
| Other values (19) | 9646 |
Most occurring characters
| Value | Count | Frequency (%) |
| 29088 | ||
| 23749 | ||
| 22664 | ||
| 18500 | ||
| 15295 | 7.6% | |
| 12609 | 6.3% | |
| 12047 | 6.0% | |
| 10168 | 5.1% | |
| 7912 | 3.9% | |
| 6718 | 3.3% | |
| Other values (12) | 42091 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 171753 | |
| Space Separator | 29088 | 14.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 23749 | ||
| 22664 | ||
| 18500 | ||
| 15295 | ||
| 12609 | 7.3% | |
| 12047 | 7.0% | |
| 10168 | 5.9% | |
| 7912 | 4.6% | |
| 6718 | 3.9% | |
| 6453 | 3.8% | |
| Other values (11) | 35638 |
Space Separator
| Value | Count | Frequency (%) |
| 29088 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 171753 | |
| Common | 29088 | 14.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 23749 | ||
| 22664 | ||
| 18500 | ||
| 15295 | ||
| 12609 | 7.3% | |
| 12047 | 7.0% | |
| 10168 | 5.9% | |
| 7912 | 4.6% | |
| 6718 | 3.9% | |
| 6453 | 3.8% | |
| Other values (11) | 35638 |
Common
| Value | Count | Frequency (%) |
| 29088 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 200841 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 29088 | ||
| 23749 | ||
| 22664 | ||
| 18500 | ||
| 15295 | 7.6% | |
| 12609 | 6.3% | |
| 12047 | 6.0% | |
| 10168 | 5.1% | |
| 7912 | 3.9% | |
| 6718 | 3.3% | |
| Other values (12) | 42091 |
readingComprehension_11
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25377 |
| Missing (%) | 89.1% |
| Memory size | 1.0 MiB |
Length
| Max length | 44 |
|---|---|
| Median length | 20 |
| Mean length | 25.665704 |
| Min length | 20 |
Characters and Unicode
| Total characters | 80000 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2091 | 7.3% | |
| 545 | 1.9% | |
| 252 | 0.9% | |
| 229 | 0.8% | |
| (Missing) | 25377 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3117 | ||
| 2343 | ||
| 2091 | ||
| 2091 | ||
| 1548 | ||
| 774 | 5.0% | |
| 774 | 5.0% | |
| 545 | 3.5% | |
| 545 | 3.5% | |
| 545 | 3.5% | |
| Other values (5) | 1191 | 7.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 13805 | ||
| 12447 | ||
| 11444 | ||
| 7324 | ||
| 6756 | ||
| 5185 | 6.5% | |
| 4436 | 5.5% | |
| 3117 | 3.9% | |
| 3117 | 3.9% | |
| 2091 | 2.6% | |
| Other values (14) | 10278 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 61319 | |
| Space Separator | 12447 | 15.6% |
| Uppercase Letter | 3117 | 3.9% |
| Other Punctuation | 3117 | 3.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 13805 | ||
| 11444 | ||
| 7324 | ||
| 6756 | ||
| 5185 | 8.5% | |
| 4436 | 7.2% | |
| 2091 | 3.4% | |
| 2029 | 3.3% | |
| 1548 | 2.5% | |
| 1342 | 2.2% | |
| Other values (11) | 5359 | 8.7% |
Space Separator
| Value | Count | Frequency (%) |
| 12447 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3117 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3117 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 64436 | |
| Common | 15564 | 19.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 13805 | ||
| 11444 | ||
| 7324 | ||
| 6756 | ||
| 5185 | 8.0% | |
| 4436 | 6.9% | |
| 3117 | 4.8% | |
| 2091 | 3.2% | |
| 2029 | 3.1% | |
| 1548 | 2.4% | |
| Other values (12) | 6701 |
Common
| Value | Count | Frequency (%) |
| 12447 | ||
| 3117 | 20.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 80000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 13805 | ||
| 12447 | ||
| 11444 | ||
| 7324 | ||
| 6756 | ||
| 5185 | 6.5% | |
| 4436 | 5.5% | |
| 3117 | 3.9% | |
| 3117 | 3.9% | |
| 2091 | 2.6% | |
| Other values (14) | 10278 |
readingComprehension_2
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25023 |
| Missing (%) | 87.8% |
| Memory size | 1005.8 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 8.9841544 |
| Min length | 8 |
Characters and Unicode
| Total characters | 31184 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 3156 | 11.1% | |
| 198 | 0.7% | |
| 91 | 0.3% | |
| 26 | 0.1% | |
| (Missing) | 25023 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3156 | ||
| 198 | 5.7% | |
| 91 | 2.6% | |
| 26 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 9900 | ||
| 3471 | 11.1% | |
| 3364 | 10.8% | |
| 3354 | 10.8% | |
| 3273 | 10.5% | |
| 3247 | 10.4% | |
| 3156 | 10.1% | |
| 289 | 0.9% | |
| 224 | 0.7% | |
| 198 | 0.6% | |
| Other values (7) | 708 | 2.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 31184 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9900 | ||
| 3471 | 11.1% | |
| 3364 | 10.8% | |
| 3354 | 10.8% | |
| 3273 | 10.5% | |
| 3247 | 10.4% | |
| 3156 | 10.1% | |
| 289 | 0.9% | |
| 224 | 0.7% | |
| 198 | 0.6% | |
| Other values (7) | 708 | 2.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 31184 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9900 | ||
| 3471 | 11.1% | |
| 3364 | 10.8% | |
| 3354 | 10.8% | |
| 3273 | 10.5% | |
| 3247 | 10.4% | |
| 3156 | 10.1% | |
| 289 | 0.9% | |
| 224 | 0.7% | |
| 198 | 0.6% | |
| Other values (7) | 708 | 2.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 31184 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9900 | ||
| 3471 | 11.1% | |
| 3364 | 10.8% | |
| 3354 | 10.8% | |
| 3273 | 10.5% | |
| 3247 | 10.4% | |
| 3156 | 10.1% | |
| 289 | 0.9% | |
| 224 | 0.7% | |
| 198 | 0.6% | |
| Other values (7) | 708 | 2.3% |
readingComprehension_3
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25000 |
| Missing (%) | 87.7% |
| Memory size | 1016.0 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 11.763595 |
| Min length | 7 |
Characters and Unicode
| Total characters | 41102 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 3167 | 11.1% | |
| 145 | 0.5% | |
| 101 | 0.4% | |
| 81 | 0.3% | |
| (Missing) | 25000 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3494 | ||
| 3167 | ||
| 3167 | ||
| 182 | 1.8% | |
| 145 | 1.4% | |
| 101 | 1.0% | |
| 81 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 6843 | ||
| 6334 | ||
| 6334 | ||
| 3514 | ||
| 3494 | ||
| 3494 | ||
| 3494 | ||
| 3349 | ||
| 3167 | ||
| 283 | 0.7% | |
| Other values (7) | 796 | 1.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 34259 | |
| Space Separator | 6843 | 16.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 6334 | ||
| 6334 | ||
| 3514 | ||
| 3494 | ||
| 3494 | ||
| 3494 | ||
| 3349 | ||
| 3167 | ||
| 283 | 0.8% | |
| 182 | 0.5% | |
| Other values (6) | 614 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| 6843 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 34259 | |
| Common | 6843 | 16.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 6334 | ||
| 6334 | ||
| 3514 | ||
| 3494 | ||
| 3494 | ||
| 3494 | ||
| 3349 | ||
| 3167 | ||
| 283 | 0.8% | |
| 182 | 0.5% | |
| Other values (6) | 614 | 1.8% |
Common
| Value | Count | Frequency (%) |
| 6843 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 41102 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6843 | ||
| 6334 | ||
| 6334 | ||
| 3514 | ||
| 3494 | ||
| 3494 | ||
| 3494 | ||
| 3349 | ||
| 3167 | ||
| 283 | 0.7% | |
| Other values (7) | 796 | 1.9% |
readingComprehension_4
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25007 |
| Missing (%) | 87.8% |
| Memory size | 1002.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 8 |
| Mean length | 7.8554631 |
| Min length | 6 |
Characters and Unicode
| Total characters | 27392 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1423 | 5.0% | |
| 890 | 3.1% | |
| 785 | 2.8% | |
| 389 | 1.4% | |
| (Missing) | 25007 |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1423 | ||
| 890 | ||
| 785 | ||
| 389 | 11.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5299 | ||
| 3487 | ||
| 3487 | ||
| 3098 | ||
| 2846 | ||
| 1812 | 6.6% | |
| 1423 | 5.2% | |
| 1423 | 5.2% | |
| 890 | 3.2% | |
| 890 | 3.2% | |
| Other values (5) | 2737 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 27392 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 5299 | ||
| 3487 | ||
| 3487 | ||
| 3098 | ||
| 2846 | ||
| 1812 | 6.6% | |
| 1423 | 5.2% | |
| 1423 | 5.2% | |
| 890 | 3.2% | |
| 890 | 3.2% | |
| Other values (5) | 2737 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 27392 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 5299 | ||
| 3487 | ||
| 3487 | ||
| 3098 | ||
| 2846 | ||
| 1812 | 6.6% | |
| 1423 | 5.2% | |
| 1423 | 5.2% | |
| 890 | 3.2% | |
| 890 | 3.2% | |
| Other values (5) | 2737 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 27392 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 5299 | ||
| 3487 | ||
| 3487 | ||
| 3098 | ||
| 2846 | ||
| 1812 | 6.6% | |
| 1423 | 5.2% | |
| 1423 | 5.2% | |
| 890 | 3.2% | |
| 890 | 3.2% | |
| Other values (5) | 2737 |
readingComprehension_5
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25012 |
| Missing (%) | 87.8% |
| Memory size | 1012.6 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.875359 |
| Min length | 9 |
Characters and Unicode
| Total characters | 37868 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 3029 | 10.6% |
| | 226 | 0.8% |
| | 207 | 0.7% |
| | 20 | 0.1% |
| (Missing) | 25012 | 87.8% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 3029 | ||
| 226 | 6.5% | |
| 207 | 5.9% | |
| 20 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 9787 | ||
| 3688 | 9.7% | |
| 3501 | 9.2% | |
| 3482 | 9.2% | |
| 3275 | 8.6% | |
| 3236 | 8.5% | |
| 3236 | 8.5% | |
| 3049 | 8.1% | |
| 3029 | 8.0% | |
| 640 | 1.7% | |
| Other values (6) | 945 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 37868 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9787 | ||
| 3688 | 9.7% | |
| 3501 | 9.2% | |
| 3482 | 9.2% | |
| 3275 | 8.6% | |
| 3236 | 8.5% | |
| 3236 | 8.5% | |
| 3049 | 8.1% | |
| 3029 | 8.0% | |
| 640 | 1.7% | |
| Other values (6) | 945 | 2.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 37868 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9787 | ||
| 3688 | 9.7% | |
| 3501 | 9.2% | |
| 3482 | 9.2% | |
| 3275 | 8.6% | |
| 3236 | 8.5% | |
| 3236 | 8.5% | |
| 3049 | 8.1% | |
| 3029 | 8.0% | |
| 640 | 1.7% | |
| Other values (6) | 945 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 37868 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9787 | ||
| 3688 | 9.7% | |
| 3501 | 9.2% | |
| 3482 | 9.2% | |
| 3275 | 8.6% | |
| 3236 | 8.5% | |
| 3236 | 8.5% | |
| 3049 | 8.1% | |
| 3029 | 8.0% | |
| 640 | 1.7% | |
| Other values (6) | 945 | 2.5% |
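The overview figures for readingComprehension_5 can be cross-checked against each other with a little arithmetic. The sketch below is only a consistency check: the variable names are illustrative, and the numbers are copied from the tables above and from the dataset-level observation count (28494).

```python
# Figures copied from the report; names are illustrative, not part of the profiler.
n_observations = 28494      # dataset-level "Number of observations"
missing = 25012             # "Missing" for readingComprehension_5
mean_length = 10.875359     # "Mean length"
total_characters = 37868    # "Total characters"

# Rows that actually carry a value for this variable.
non_missing = n_observations - missing

# Share of rows without a value, as a percentage.
missing_pct = 100 * missing / n_observations

# Mean length times the number of non-missing rows should reproduce the character total.
implied_total = non_missing * mean_length

print(f"non-missing rows: {non_missing}")
print(f"missing: {missing_pct:.1f}%")
print(f"implied total characters: {implied_total:.0f} (reported: {total_characters})")
```

The same check applies to the other categorical variables in this report by swapping in their Missing, Mean length, and Total characters values.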
readingComprehension_6
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25018 |
| Missing (%) | 87.8% |
| Memory size | 1001.7 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 7.7459724 |
| Min length | 7 |
Characters and Unicode
| Total characters | 26925 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2452 | 8.6% |
| | 705 | 2.5% |
| | 160 | 0.6% |
| | 159 | 0.6% |
| (Missing) | 25018 | 87.8% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 2452 | ||
| 705 | 20.3% | |
| 160 | 4.6% | |
| 159 | 4.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5063 | ||
| 4182 | ||
| 3476 | ||
| 3157 | ||
| 2771 | ||
| 2611 | ||
| 1410 | 5.2% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| Other values (7) | 2140 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 26925 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 5063 | ||
| 4182 | ||
| 3476 | ||
| 3157 | ||
| 2771 | ||
| 2611 | ||
| 1410 | 5.2% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| Other values (7) | 2140 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 26925 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 5063 | ||
| 4182 | ||
| 3476 | ||
| 3157 | ||
| 2771 | ||
| 2611 | ||
| 1410 | 5.2% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| Other values (7) | 2140 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 26925 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 5063 | ||
| 4182 | ||
| 3476 | ||
| 3157 | ||
| 2771 | ||
| 2611 | ||
| 1410 | 5.2% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| 705 | 2.6% | |
| Other values (7) | 2140 |
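readingComprehension_5 carries an IMBALANCE flag while readingComprehension_6 does not. One plausible way to reproduce that difference is a normalized Shannon-entropy score computed from the Common Values counts. The sketch below assumes that definition and a 0.5 cutoff; both are assumptions about the profiler, not something this report states, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy (in bits) of the
    class distribution and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# Non-missing value counts copied from the Common Values tables above.
rc5 = imbalance_score([3029, 226, 207, 20])    # readingComprehension_5
rc6 = imbalance_score([2452, 705, 160, 159])   # readingComprehension_6

ASSUMED_THRESHOLD = 0.5  # assumption, not taken from the report
print(f"readingComprehension_5: {rc5:.3f} flagged={rc5 > ASSUMED_THRESHOLD}")
print(f"readingComprehension_6: {rc6:.3f} flagged={rc6 > ASSUMED_THRESHOLD}")
```

Under these assumptions the first variable scores above the cutoff and the second below it, which matches the flags shown in the report.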
readingComprehension_7
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25000 |
| Missing (%) | 87.7% |
| Memory size | 1.0 MiB |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 18.381225 |
| Min length | 11 |
Characters and Unicode
| Total characters | 64224 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 1875 | 6.6% |
| | 1530 | 5.4% |
| | 46 | 0.2% |
| | 43 | 0.2% |
| (Missing) | 25000 | 87.7% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 1921 | ||
| 1875 | ||
| 1875 | ||
| 1875 | ||
| 1573 | ||
| 1530 | ||
| 1530 | ||
| 46 | 0.4% | |
| 46 | 0.4% | |
| 43 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 8863 | ||
| 8518 | ||
| 5717 | ||
| 5671 | ||
| 5671 | ||
| 5369 | ||
| 4978 | ||
| 3497 | 5.4% | |
| 3494 | 5.4% | |
| 3451 | 5.4% | |
| Other values (9) | 8995 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 55361 | |
| Space Separator | 8863 | 13.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 8518 | ||
| 5717 | ||
| 5671 | ||
| 5671 | ||
| 5369 | ||
| 4978 | ||
| 3497 | ||
| 3494 | ||
| 3451 | ||
| 1921 | 3.5% | |
| Other values (8) | 7074 |
Space Separator
| Value | Count | Frequency (%) |
| 8863 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 55361 | |
| Common | 8863 | 13.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 8518 | ||
| 5717 | ||
| 5671 | ||
| 5671 | ||
| 5369 | ||
| 4978 | ||
| 3497 | ||
| 3494 | ||
| 3451 | ||
| 1921 | 3.5% | |
| Other values (8) | 7074 |
Common
| Value | Count | Frequency (%) |
| 8863 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 64224 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 8863 | ||
| 8518 | ||
| 5717 | ||
| 5671 | ||
| 5671 | ||
| 5369 | ||
| 4978 | ||
| 3497 | 5.4% | |
| 3494 | 5.4% | |
| 3451 | 5.4% | |
| Other values (9) | 8995 |
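The "Most occurring categories" breakdown for readingComprehension_7 (Lowercase Letter and Space Separator) corresponds to Unicode general categories. A minimal sketch of how such counts could be produced with Python's standard `unicodedata` module follows. The sample strings are made up, because the actual answer texts are not captured in this report, and `category_counts` is a hypothetical helper rather than the profiler's own code.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for categories not named above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Made-up sample values; the real column contents are not available here.
sample = ["made up answer", "another one"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Scripts and blocks (Latin, Common, ASCII) are not exposed by `unicodedata`, so reproducing those tables would need a separate lookup.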
readingComprehension_8
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25030 |
| Missing (%) | 87.8% |
| Memory size | 1007.6 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.5840069 |
| Min length | 6 |
Characters and Unicode
| Total characters | 33199 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2574 | 9.0% |
| | 502 | 1.8% |
| | 277 | 1.0% |
| | 111 | 0.4% |
| (Missing) | 25030 | 87.8% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 2574 | ||
| 502 | 14.5% | |
| 277 | 8.0% | |
| 111 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 7722 | ||
| 5650 | ||
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 2851 | 8.6% | |
| 2574 | 7.8% | |
| 1503 | 4.5% | |
| 1001 | 3.0% | |
| 890 | 2.7% | |
| Other values (5) | 1780 | 5.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 33199 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 7722 | ||
| 5650 | ||
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 2851 | 8.6% | |
| 2574 | 7.8% | |
| 1503 | 4.5% | |
| 1001 | 3.0% | |
| 890 | 2.7% | |
| Other values (5) | 1780 | 5.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 33199 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 7722 | ||
| 5650 | ||
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 2851 | 8.6% | |
| 2574 | 7.8% | |
| 1503 | 4.5% | |
| 1001 | 3.0% | |
| 890 | 2.7% | |
| Other values (5) | 1780 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 33199 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 7722 | ||
| 5650 | ||
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 3076 | 9.3% | |
| 2851 | 8.6% | |
| 2574 | 7.8% | |
| 1503 | 4.5% | |
| 1001 | 3.0% | |
| 890 | 2.7% | |
| Other values (5) | 1780 | 5.4% |
readingComprehension_9
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25014 |
| Missing (%) | 87.8% |
| Memory size | 1009.0 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 9 |
| Mean length | 9.8436782 |
| Min length | 4 |
Characters and Unicode
| Total characters | 34256 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 1790 | 6.3% |
| | 1397 | 4.9% |
| | 188 | 0.7% |
| | 105 | 0.4% |
| (Missing) | 25014 | 87.8% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 1790 | ||
| 1397 | ||
| 188 | 5.4% | |
| 105 | 3.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 7483 | ||
| 4977 | ||
| 3563 | ||
| 3480 | ||
| 3292 | ||
| 3187 | ||
| 1895 | 5.5% | |
| 1790 | 5.2% | |
| 1397 | 4.1% | |
| 1397 | 4.1% | |
| Other values (4) | 1795 | 5.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 34256 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 7483 | ||
| 4977 | ||
| 3563 | ||
| 3480 | ||
| 3292 | ||
| 3187 | ||
| 1895 | 5.5% | |
| 1790 | 5.2% | |
| 1397 | 4.1% | |
| 1397 | 4.1% | |
| Other values (4) | 1795 | 5.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 34256 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 7483 | ||
| 4977 | ||
| 3563 | ||
| 3480 | ||
| 3292 | ||
| 3187 | ||
| 1895 | 5.5% | |
| 1790 | 5.2% | |
| 1397 | 4.1% | |
| 1397 | 4.1% | |
| Other values (4) | 1795 | 5.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 34256 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 7483 | ||
| 4977 | ||
| 3563 | ||
| 3480 | ||
| 3292 | ||
| 3187 | ||
| 1895 | 5.5% | |
| 1790 | 5.2% | |
| 1397 | 4.1% | |
| 1397 | 4.1% | |
| Other values (4) | 1795 | 5.2% |
score
Real number (ℝ)
| Distinct | 34 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 11229 |
| Missing (%) | 39.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.793223 |
| Minimum | 0 |
| Maximum | 33 |
| Zeros | 25 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 222.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 7 |
| median | 9 |
| Q3 | 13 |
| 95-th percentile | 27 |
| Maximum | 33 |
| Range | 33 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 6.1324478 |
|---|---|
| Coefficient of variation (CV) | 0.56817576 |
| Kurtosis | 2.2937556 |
| Mean | 10.793223 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.5992262 |
| Sum | 186345 |
| Variance | 37.606917 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 2040 | 7.2% |
| 7 | 1896 | 6.7% |
| 9 | 1558 | 5.5% |
| 6 | 1538 | 5.4% |
| 13 | 1472 | 5.2% |
| 12 | 1289 | 4.5% |
| 14 | 1181 | 4.1% |
| 10 | 1120 | 3.9% |
| 11 | 1057 | 3.7% |
| 5 | 990 | 3.5% |
| Other values (24) | 3124 | 11.0% |
| (Missing) | 11229 | 39.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 25 | 0.1% |
| 1 | 55 | 0.2% |
| 2 | 121 | 0.4% |
| 3 | 272 | 1.0% |
| 4 | 599 | 2.1% |
| 5 | 990 | 3.5% |
| 6 | 1538 | 5.4% |
| 7 | 1896 | 6.7% |
| 8 | 2040 | 7.2% |
| 9 | 1558 | 5.5% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 33 | 14 | < 0.1% |
| 32 | 16 | 0.1% |
| 31 | 49 | 0.2% |
| 30 | 124 | 0.4% |
| 29 | 193 | 0.7% |
| 28 | 252 | 0.9% |
| 27 | 258 | 0.9% |
| 26 | 250 | 0.9% |
| 25 | 180 | 0.6% |
| 24 | 155 | 0.5% |
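Several of the score statistics above are derived from one another, so they can be verified with simple arithmetic. The sketch below recomputes the mean, variance, coefficient of variation, interquartile range, and missing share from the reported figures. Variable names are illustrative, and the tolerances only absorb the rounding already present in the report.

```python
# Figures copied from the score tables; names are illustrative.
n_observations = 28494
missing = 11229
total_sum = 186345          # "Sum"
std_dev = 6.1324478         # "Standard deviation"
reported_mean = 10.793223
reported_variance = 37.606917
reported_cv = 0.56817576
q1, q3 = 7, 13

# The mean is taken over non-missing rows only.
non_missing = n_observations - missing
mean = total_sum / non_missing

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# Coefficient of variation is standard deviation divided by mean.
cv = std_dev / reported_mean

# Interquartile range is Q3 minus Q1.
iqr = q3 - q1

# Share of rows with no score, as a percentage.
missing_pct = 100 * missing / n_observations

print(f"mean={mean:.6f} variance={variance:.6f} cv={cv:.8f}")
print(f"iqr={iqr} missing={missing_pct:.1f}%")
```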
scoreBreakdown.pickIncorrect
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
scoreBreakdown.tenses
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
scoreBreakdown.wordSelection
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
situationalJudgement_1
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25837 |
| Missing (%) | 90.7% |
| Memory size | 1.5 MiB |
Length
| Max length | 233 |
|---|---|
| Median length | 233 |
| Mean length | 219.28039 |
| Min length | 4 |
Characters and Unicode
| Total characters | 582628 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2388 | 8.4% |
| | 210 | 0.7% |
| | 46 | 0.2% |
| | 13 | < 0.1% |
| (Missing) | 25837 | 90.7% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 15050 | 15.4% | |
| 5032 | 5.2% | |
| 4986 | 5.1% | |
| 4776 | 4.9% | |
| 2808 | 2.9% | |
| 2598 | 2.7% | |
| 2598 | 2.7% | |
| 2434 | 2.5% | |
| 2434 | 2.5% | |
| 2434 | 2.5% | |
| Other values (41) | 52439 |
Most occurring characters
| Value | Count | Frequency (%) |
| 95188 | ||
| 59813 | ||
| 57642 | ||
| 46980 | 8.1% | |
| 40059 | 6.9% | |
| 35073 | 6.0% | |
| 32580 | 5.6% | |
| 30310 | 5.2% | |
| 28106 | 4.8% | |
| 26708 | 4.6% | |
| Other values (22) | 130169 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 472495 | |
| Space Separator | 95188 | 16.3% |
| Other Punctuation | 9900 | 1.7% |
| Uppercase Letter | 5045 | 0.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 59813 | ||
| 57642 | ||
| 46980 | ||
| 40059 | ||
| 35073 | 7.4% | |
| 32580 | 6.9% | |
| 30310 | 6.4% | |
| 28106 | 5.9% | |
| 26708 | 5.7% | |
| 22214 | 4.7% | |
| Other values (13) | 93010 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2388 | ||
| 2388 | ||
| 210 | 4.2% | |
| 46 | 0.9% | |
| 13 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5032 | ||
| 2480 | ||
| 2388 |
Space Separator
| Value | Count | Frequency (%) |
| 95188 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 477540 | |
| Common | 105088 | 18.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 59813 | ||
| 57642 | ||
| 46980 | ||
| 40059 | ||
| 35073 | 7.3% | |
| 32580 | 6.8% | |
| 30310 | 6.3% | |
| 28106 | 5.9% | |
| 26708 | 5.6% | |
| 22214 | 4.7% | |
| Other values (18) | 98055 |
Common
| Value | Count | Frequency (%) |
| 95188 | ||
| 5032 | 4.8% | |
| 2480 | 2.4% | |
| 2388 | 2.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 582628 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 95188 | ||
| 59813 | ||
| 57642 | ||
| 46980 | 8.1% | |
| 40059 | 6.9% | |
| 35073 | 6.0% | |
| 32580 | 5.6% | |
| 30310 | 5.2% | |
| 28106 | 4.8% | |
| 26708 | 4.6% | |
| Other values (22) | 130169 |
situationalJudgement_10
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26378 |
| Missing (%) | 92.6% |
| Memory size | 1.3 MiB |
Length
| Max length | 219 |
|---|---|
| Median length | 219 |
| Mean length | 207.94234 |
| Min length | 7 |
Characters and Unicode
| Total characters | 440006 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 1953 | 6.9% |
| | 94 | 0.3% |
| | 59 | 0.2% |
| | 10 | < 0.1% |
| (Missing) | 26378 | 92.6% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 6151 | 8.9% | |
| 4059 | 5.9% | |
| 4010 | 5.8% | |
| 3906 | 5.7% | |
| 2057 | 3.0% | |
| 2047 | 3.0% | |
| 2047 | 3.0% | |
| 1963 | 2.9% | |
| 1963 | 2.9% | |
| 1953 | 2.8% | |
| Other values (39) | 38611 |
Most occurring characters
| Value | Count | Frequency (%) |
| 66651 | ||
| 42003 | 9.5% | |
| 36163 | 8.2% | |
| 35758 | 8.1% | |
| 32238 | 7.3% | |
| 30379 | 6.9% | |
| 20277 | 4.6% | |
| 20030 | 4.6% | |
| 18087 | 4.1% | |
| 17918 | 4.1% | |
| Other values (21) | 120502 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 365217 | |
| Space Separator | 66651 | 15.1% |
| Uppercase Letter | 4128 | 0.9% |
| Other Punctuation | 4010 | 0.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 42003 | ||
| 36163 | 9.9% | |
| 35758 | 9.8% | |
| 32238 | 8.8% | |
| 30379 | 8.3% | |
| 20277 | 5.6% | |
| 20030 | 5.5% | |
| 18087 | 5.0% | |
| 17918 | 4.9% | |
| 15916 | 4.4% | |
| Other values (14) | 96448 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1953 | ||
| 1953 | ||
| 104 | 2.5% | |
| 59 | 1.4% | |
| 59 | 1.4% |
Space Separator
| Value | Count | Frequency (%) |
| 66651 |
Other Punctuation
| Value | Count | Frequency (%) |
| 4010 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 369345 | |
| Common | 70661 | 16.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 42003 | ||
| 36163 | 9.8% | |
| 35758 | 9.7% | |
| 32238 | 8.7% | |
| 30379 | 8.2% | |
| 20277 | 5.5% | |
| 20030 | 5.4% | |
| 18087 | 4.9% | |
| 17918 | 4.9% | |
| 15916 | 4.3% | |
| Other values (19) | 100576 |
Common
| Value | Count | Frequency (%) |
| 66651 | ||
| 4010 | 5.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 440006 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 66651 | ||
| 42003 | 9.5% | |
| 36163 | 8.2% | |
| 35758 | 8.1% | |
| 32238 | 7.3% | |
| 30379 | 6.9% | |
| 20277 | 4.6% | |
| 20030 | 4.6% | |
| 18087 | 4.1% | |
| 17918 | 4.1% | |
| Other values (21) | 120502 |
situationalJudgement_11
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25957 |
| Missing (%) | 91.1% |
| Memory size | 1.4 MiB |
Length
| Max length | 196 |
|---|---|
| Median length | 196 |
| Mean length | 184.60741 |
| Min length | 18 |
Characters and Unicode
| Total characters | 468349 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2343 | 8.2% |
| | 107 | 0.4% |
| | 59 | 0.2% |
| | 28 | 0.1% |
| (Missing) | 25957 | 91.1% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 4987 | 5.6% | |
| 4714 | 5.3% | |
| 4714 | 5.3% | |
| 4686 | 5.3% | |
| 4686 | 5.3% | |
| 4686 | 5.3% | |
| 2478 | 2.8% | |
| 2478 | 2.8% | |
| 2450 | 2.8% | |
| 2343 | 2.7% | |
| Other values (38) | 50088 |
Most occurring characters
| Value | Count | Frequency (%) |
| 85880 | ||
| 45583 | 9.7% | |
| 40784 | 8.7% | |
| 28529 | 6.1% | |
| 26614 | 5.7% | |
| 26313 | 5.6% | |
| 26122 | 5.6% | |
| 23759 | 5.1% | |
| 21385 | 4.6% | |
| 18907 | 4.0% | |
| Other values (21) | 124473 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 370366 | |
| Space Separator | 85880 | 18.3% |
| Other Punctuation | 7223 | 1.5% |
| Uppercase Letter | 4880 | 1.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 45583 | ||
| 40784 | ||
| 28529 | 7.7% | |
| 26614 | 7.2% | |
| 26313 | 7.1% | |
| 26122 | 7.1% | |
| 23759 | 6.4% | |
| 21385 | 5.8% | |
| 18907 | 5.1% | |
| 16429 | 4.4% | |
| Other values (13) | 95941 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2343 | ||
| 2343 | ||
| 107 | 2.2% | |
| 59 | 1.2% | |
| 28 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| 4880 | ||
| 2343 |
Space Separator
| Value | Count | Frequency (%) |
| 85880 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 375246 | |
| Common | 93103 | 19.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 45583 | ||
| 40784 | ||
| 28529 | 7.6% | |
| 26614 | 7.1% | |
| 26313 | 7.0% | |
| 26122 | 7.0% | |
| 23759 | 6.3% | |
| 21385 | 5.7% | |
| 18907 | 5.0% | |
| 16429 | 4.4% | |
| Other values (18) | 100821 |
Common
| Value | Count | Frequency (%) |
| 85880 | ||
| 4880 | 5.2% | |
| 2343 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 468349 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 85880 | ||
| 45583 | 9.7% | |
| 40784 | 8.7% | |
| 28529 | 6.1% | |
| 26614 | 5.7% | |
| 26313 | 5.6% | |
| 26122 | 5.6% | |
| 23759 | 5.1% | |
| 21385 | 4.6% | |
| 18907 | 4.0% | |
| Other values (21) | 124473 |
situationalJudgement_12
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26433 |
| Missing (%) | 92.8% |
| Memory size | 1.0 MiB |
Length
| Max length | 97 |
|---|---|
| Median length | 79 |
| Mean length | 52.043668 |
| Min length | 7 |
Characters and Unicode
| Total characters | 107262 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 4 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 975 | 3.4% |
| | 891 | 3.1% |
| | 102 | 0.4% |
| | 93 | 0.3% |
| (Missing) | 26433 | 92.8% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 2172 | 9.4% | |
| 1884 | 8.2% | |
| 1875 | 8.2% | |
| 1086 | 4.7% | |
| 975 | 4.2% | |
| 975 | 4.2% | |
| 975 | 4.2% | |
| 891 | 3.9% | |
| 891 | 3.9% | |
| 891 | 3.9% | |
| Other values (24) | 10386 |
Most occurring characters
| Value | Count | Frequency (%) |
| 20940 | ||
| 13746 | ||
| 11166 | ||
| 10368 | ||
| 6822 | 6.4% | |
| 3870 | 3.6% | |
| 3861 | 3.6% | |
| 3861 | 3.6% | |
| 3537 | 3.3% | |
| 3453 | 3.2% | |
| Other values (19) | 25638 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 82107 | |
| Space Separator | 20940 | 19.5% |
| Uppercase Letter | 3036 | 2.8% |
| Other Punctuation | 1179 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 13746 | ||
| 11166 | ||
| 10368 | ||
| 6822 | 8.3% | |
| 3870 | 4.7% | |
| 3861 | 4.7% | |
| 3861 | 4.7% | |
| 3537 | 4.3% | |
| 3453 | 4.2% | |
| 3360 | 4.1% | |
| Other values (12) | 18063 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1077 | ||
| 975 | ||
| 891 | ||
| 93 | 3.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 1086 | ||
| 93 | 7.9% |
Space Separator
| Value | Count | Frequency (%) |
| 20940 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 85143 | |
| Common | 22119 | 20.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 13746 | ||
| 11166 | ||
| 10368 | ||
| 6822 | 8.0% | |
| 3870 | 4.5% | |
| 3861 | 4.5% | |
| 3861 | 4.5% | |
| 3537 | 4.2% | |
| 3453 | 4.1% | |
| 3360 | 3.9% | |
| Other values (16) | 21099 |
Common
| Value | Count | Frequency (%) |
| 20940 | ||
| 1086 | 4.9% | |
| 93 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 107262 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 20940 | ||
| 13746 | ||
| 11166 | ||
| 10368 | ||
| 6822 | 6.4% | |
| 3870 | 3.6% | |
| 3861 | 3.6% | |
| 3861 | 3.6% | |
| 3537 | 3.3% | |
| 3453 | 3.2% | |
| Other values (19) | 25638 |
situationalJudgement_13
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25968 |
| Missing (%) | 91.1% |
| Memory size | 1.4 MiB |
Length
| Max length | 181 |
|---|---|
| Median length | 181 |
| Mean length | 178.72367 |
| Min length | 18 |
Characters and Unicode
| Total characters | 451456 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2480 | 8.7% |
| | 29 | 0.1% |
| | 9 | < 0.1% |
| | 8 | < 0.1% |
| (Missing) | 25968 | 91.1% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 7495 | 9.7% | |
| 4960 | 6.4% | |
| 4960 | 6.4% | |
| 4960 | 6.4% | |
| 4960 | 6.4% | |
| 2509 | 3.2% | |
| 2488 | 3.2% | |
| 2480 | 3.2% | |
| 2480 | 3.2% | |
| 2480 | 3.2% | |
| Other values (33) | 37629 |
Most occurring characters
| Value | Count | Frequency (%) |
| 74875 | ||
| 49921 | ||
| 32516 | 7.2% | |
| 32436 | 7.2% | |
| 27384 | 6.1% | |
| 27327 | 6.1% | |
| 24884 | 5.5% | |
| 24867 | 5.5% | |
| 20040 | 4.4% | |
| 19877 | 4.4% | |
| Other values (20) | 117329 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 359129 | |
| Space Separator | 74875 | 16.6% |
| Other Punctuation | 12446 | 2.8% |
| Uppercase Letter | 5006 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 49921 | ||
| 32516 | 9.1% | |
| 32436 | 9.0% | |
| 27384 | 7.6% | |
| 27327 | 7.6% | |
| 24884 | 6.9% | |
| 24867 | 6.9% | |
| 20040 | 5.6% | |
| 19877 | 5.5% | |
| 17494 | 4.9% | |
| Other values (11) | 82383 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2480 | ||
| 2480 | ||
| 29 | 0.6% | |
| 9 | 0.2% | |
| 8 | 0.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5006 | ||
| 4960 | ||
| 2480 |
Space Separator
| Value | Count | Frequency (%) |
| 74875 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 364135 | |
| Common | 87321 | 19.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 49921 | ||
| 32516 | 8.9% | |
| 32436 | 8.9% | |
| 27384 | 7.5% | |
| 27327 | 7.5% | |
| 24884 | 6.8% | |
| 24867 | 6.8% | |
| 20040 | 5.5% | |
| 19877 | 5.5% | |
| 17494 | 4.8% | |
| Other values (16) | 87389 |
Common
| Value | Count | Frequency (%) |
| 74875 | ||
| 5006 | 5.7% | |
| 4960 | 5.7% | |
| 2480 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 451456 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 74875 | ||
| 49921 | ||
| 32516 | 7.2% | |
| 32436 | 7.2% | |
| 27384 | 6.1% | |
| 27327 | 6.1% | |
| 24884 | 5.5% | |
| 24867 | 5.5% | |
| 20040 | 4.4% | |
| 19877 | 4.4% | |
| Other values (20) | 117329 |
situationalJudgement_14
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25826 |
| Missing (%) | 90.6% |
| Memory size | 1.6 MiB |
Length
| Max length | 247 |
|---|---|
| Median length | 247 |
| Mean length | 243.15817 |
| Min length | 7 |
Characters and Unicode
| Total characters | 648746 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2619 | 9.2% |
| | 24 | 0.1% |
| | 18 | 0.1% |
| | 7 | < 0.1% |
| (Missing) | 25826 | 90.6% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 13126 | 11.6% | |
| 5262 | 4.7% | |
| 5262 | 4.7% | |
| 5256 | 4.7% | |
| 5238 | 4.6% | |
| 2643 | 2.3% | |
| 2643 | 2.3% | |
| 2626 | 2.3% | |
| 2626 | 2.3% | |
| 2619 | 2.3% | |
| Other values (37) | 65700 |
Most occurring characters
| Value | Count | Frequency (%) |
| 115571 | ||
| 68266 | ||
| 57838 | 8.9% | |
| 44640 | 6.9% | |
| 44586 | 6.9% | |
| 36687 | 5.7% | |
| 34198 | 5.3% | |
| 34096 | 5.3% | |
| 31562 | 4.9% | |
| 21031 | 3.2% | |
| Other values (20) | 160271 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 519951 | |
| Space Separator | 115571 | 17.8% |
| Other Punctuation | 7919 | 1.2% |
| Uppercase Letter | 5305 | 0.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 68266 | ||
| 57838 | ||
| 44640 | 8.6% | |
| 44586 | 8.6% | |
| 36687 | 7.1% | |
| 34198 | 6.6% | |
| 34096 | 6.6% | |
| 31562 | 6.1% | |
| 21031 | 4.0% | |
| 21021 | 4.0% | |
| Other values (12) | 126026 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2637 | ||
| 2619 | ||
| 24 | 0.5% | |
| 18 | 0.3% | |
| 7 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5269 | ||
| 2650 |
Space Separator
| Value | Count | Frequency (%) |
| 115571 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 525256 | |
| Common | 123490 | 19.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 68266 | ||
| 57838 | ||
| 44640 | 8.5% | |
| 44586 | 8.5% | |
| 36687 | 7.0% | |
| 34198 | 6.5% | |
| 34096 | 6.5% | |
| 31562 | 6.0% | |
| 21031 | 4.0% | |
| 21021 | 4.0% | |
| Other values (17) | 131331 |
Common
| Value | Count | Frequency (%) |
| 115571 | ||
| 5269 | 4.3% | |
| 2650 | 2.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 648746 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 115571 | ||
| 68266 | ||
| 57838 | 8.9% | |
| 44640 | 6.9% | |
| 44586 | 6.9% | |
| 36687 | 5.7% | |
| 34198 | 5.3% | |
| 34096 | 5.3% | |
| 31562 | 4.9% | |
| 21031 | 3.2% | |
| Other values (20) | 160271 |
situationalJudgement_15
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26387 |
| Missing (%) | 92.6% |
| Memory size | 1.4 MiB |
Length
| Max length | 227 |
|---|---|
| Median length | 227 |
| Mean length | 225.04746 |
| Min length | 18 |
Characters and Unicode
| Total characters | 474175 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2069 | 7.3% |
| | 22 | 0.1% |
| | 9 | < 0.1% |
| | 7 | < 0.1% |
| (Missing) | 26387 | 92.6% |
[Plots not captured: Length; Common Values (Plot)]
Words
| Value | Count | Frequency (%) |
| 16599 | ||
| 6216 | 7.8% | |
| 4147 | 5.2% | |
| 4147 | 5.2% | |
| 2091 | 2.6% | |
| 2091 | 2.6% | |
| 2091 | 2.6% | |
| 2091 | 2.6% | |
| 2078 | 2.6% | |
| 2069 | 2.6% | |
| Other values (51) | 35835 |
Most occurring characters
| Value | Count | Frequency (%) |
| 79417 | ||
| 52224 | ||
| 45949 | 9.7% | |
| 29128 | 6.1% | |
| 27087 | 5.7% | |
| 25176 | 5.3% | |
| 24983 | 5.3% | |
| 20980 | 4.4% | |
| 20878 | 4.4% | |
| 20779 | 4.4% | |
| Other values (23) | 127574 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 380111 | |
| Space Separator | 79417 | 16.7% |
| Other Punctuation | 8380 | 1.8% |
| Uppercase Letter | 6267 | 1.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 52224 | ||
| 45949 | ||
| 29128 | 7.7% | |
| 27087 | 7.1% | |
| 25176 | 6.6% | |
| 24983 | 6.6% | |
| 20980 | 5.5% | |
| 20878 | 5.5% | |
| 20779 | 5.5% | |
| 16738 | 4.4% | |
| Other values (13) | 96189 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2091 | ||
| 2069 | ||
| 2069 | ||
| 22 | 0.4% | |
| 9 | 0.1% | |
| 7 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 6267 | ||
| 2091 | 25.0% | |
| 22 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| 79417 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 386378 | |
| Common | 87797 | 18.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 52224 | ||
| 45949 | ||
| 29128 | 7.5% | |
| 27087 | 7.0% | |
| 25176 | 6.5% | |
| 24983 | 6.5% | |
| 20980 | 5.4% | |
| 20878 | 5.4% | |
| 20779 | 5.4% | |
| 16738 | 4.3% | |
| Other values (19) | 102456 |
Common
| Value | Count | Frequency (%) |
| 79417 | ||
| 6267 | 7.1% | |
| 2091 | 2.4% | |
| 22 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 474175 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 79417 | ||
| 52224 | ||
| 45949 | 9.7% | |
| 29128 | 6.1% | |
| 27087 | 5.7% | |
| 25176 | 5.3% | |
| 24983 | 5.3% | |
| 20980 | 4.4% | |
| 20878 | 4.4% | |
| 20779 | 4.4% | |
| Other values (23) | 127574 |
situationalJudgement_2
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26363 |
| Missing (%) | 92.5% |
| Memory size | 1.2 MiB |
Length
| Max length | 123 |
|---|---|
| Median length | 123 |
| Mean length | 118.71422 |
| Min length | 7 |
Characters and Unicode
| Total characters | 252980 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2028 | 7.1% |
| | 67 | 0.2% |
| | 19 | 0.1% |
| | 17 | 0.1% |
| (Missing) | 26363 | 92.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 4142 | 9.6% | |
| 4126 | 9.5% | |
| 4056 | 9.4% | |
| 2114 | 4.9% | |
| 2045 | 4.7% | |
| 2045 | 4.7% | |
| 2028 | 4.7% | |
| 2028 | 4.7% | |
| 2028 | 4.7% | |
| 2028 | 4.7% | |
| Other values (32) | 16725 |
Most occurring characters
| Value | Count | Frequency (%) |
| 41234 | ||
| 26724 | ||
| 20388 | 8.1% | |
| 16523 | 6.5% | |
| 14428 | 5.7% | |
| 14393 | 5.7% | |
| 14337 | 5.7% | |
| 14266 | 5.6% | |
| 12380 | 4.9% | |
| 12364 | 4.9% | |
| Other values (21) | 65943 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 207467 | |
| Space Separator | 41234 | 16.3% |
| Uppercase Letter | 2198 | 0.9% |
| Other Punctuation | 2081 | 0.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 26724 | ||
| 20388 | ||
| 16523 | 8.0% | |
| 14428 | 7.0% | |
| 14393 | 6.9% | |
| 14337 | 6.9% | |
| 14266 | 6.9% | |
| 12380 | 6.0% | |
| 12364 | 6.0% | |
| 12312 | 5.9% | |
| Other values (14) | 49352 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2095 | ||
| 67 | 3.0% | |
| 19 | 0.9% | |
| 17 | 0.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| 2064 | ||
| 17 | 0.8% |
Space Separator
| Value | Count | Frequency (%) |
| 41234 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 209665 | |
| Common | 43315 | 17.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 26724 | ||
| 20388 | 9.7% | |
| 16523 | 7.9% | |
| 14428 | 6.9% | |
| 14393 | 6.9% | |
| 14337 | 6.8% | |
| 14266 | 6.8% | |
| 12380 | 5.9% | |
| 12364 | 5.9% | |
| 12312 | 5.9% | |
| Other values (18) | 51550 |
Common
| Value | Count | Frequency (%) |
| 41234 | ||
| 2064 | 4.8% | |
| 17 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 252980 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 41234 | ||
| 26724 | ||
| 20388 | 8.1% | |
| 16523 | 6.5% | |
| 14428 | 5.7% | |
| 14393 | 5.7% | |
| 14337 | 5.7% | |
| 14266 | 5.6% | |
| 12380 | 4.9% | |
| 12364 | 4.9% | |
| Other values (21) | 65943 |
situationalJudgement_3
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26420 |
| Missing (%) | 92.7% |
| Memory size | 1.1 MiB |
Length
| Max length | 129 |
|---|---|
| Median length | 110 |
| Mean length | 89.832208 |
| Min length | 7 |
Characters and Unicode
| Total characters | 186312 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|  | 1508 | 5.3% |
|  | 401 | 1.4% |
|  | 135 | 0.5% |
|  | 30 | 0.1% |
| (Missing) | 26420 | 92.7% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3016 | 8.7% | |
| 1913 | 5.5% | |
| 1778 | 5.1% | |
| 1643 | 4.8% | |
| 1643 | 4.8% | |
| 1508 | 4.4% | |
| 1508 | 4.4% | |
| 1508 | 4.4% | |
| 1508 | 4.4% | |
| 1508 | 4.4% | |
| Other values (26) | 17025 |
Most occurring characters
| Value | Count | Frequency (%) |
| 32619 | ||
| 19851 | 10.7% | |
| 18613 | 10.0% | |
| 13009 | 7.0% | |
| 10829 | 5.8% | |
| 10424 | 5.6% | |
| 8620 | 4.6% | |
| 8350 | 4.5% | |
| 6842 | 3.7% | |
| 6707 | 3.6% | |
| Other values (21) | 50448 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 147902 | |
| Space Separator | 32619 | 17.5% |
| Other Punctuation | 3286 | 1.8% |
| Uppercase Letter | 2505 | 1.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 19851 | ||
| 18613 | ||
| 13009 | 8.8% | |
| 10829 | 7.3% | |
| 10424 | 7.0% | |
| 8620 | 5.8% | |
| 8350 | 5.6% | |
| 6842 | 4.6% | |
| 6707 | 4.5% | |
| 6572 | 4.4% | |
| Other values (14) | 38085 |
Other Punctuation
| Value | Count | Frequency (%) |
| 1643 | ||
| 1508 | ||
| 135 | 4.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1538 | ||
| 566 | 22.6% | |
| 401 | 16.0% |
Space Separator
| Value | Count | Frequency (%) |
| 32619 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 150407 | |
| Common | 35905 | 19.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 19851 | ||
| 18613 | ||
| 13009 | 8.6% | |
| 10829 | 7.2% | |
| 10424 | 6.9% | |
| 8620 | 5.7% | |
| 8350 | 5.6% | |
| 6842 | 4.5% | |
| 6707 | 4.5% | |
| 6572 | 4.4% | |
| Other values (17) | 40590 |
Common
| Value | Count | Frequency (%) |
| 32619 | ||
| 1643 | 4.6% | |
| 1508 | 4.2% | |
| 135 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 186312 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 32619 | ||
| 19851 | 10.7% | |
| 18613 | 10.0% | |
| 13009 | 7.0% | |
| 10829 | 5.8% | |
| 10424 | 5.6% | |
| 8620 | 4.6% | |
| 8350 | 4.5% | |
| 6842 | 3.7% | |
| 6707 | 3.6% | |
| Other values (21) | 50448 |
situationalJudgement_4
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26402 |
| Missing (%) | 92.7% |
| Memory size | 1.2 MiB |
Length
| Max length | 139 |
|---|---|
| Median length | 139 |
| Mean length | 136.46558 |
| Min length | 7 |
Characters and Unicode
| Total characters | 285486 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|  | 2016 | 7.1% |
|  | 37 | 0.1% |
|  | 22 | 0.1% |
|  | 17 | 0.1% |
| (Missing) | 26402 | 92.7% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 6107 | 11.9% | |
| 4069 | 7.9% | |
| 4032 | 7.8% | |
| 2075 | 4.0% | |
| 2070 | 4.0% | |
| 2053 | 4.0% | |
| 2016 | 3.9% | |
| 2016 | 3.9% | |
| 2016 | 3.9% | |
| 2016 | 3.9% | |
| Other values (33) | 22927 |
Most occurring characters
| Value | Count | Frequency (%) |
| 49342 | ||
| 30580 | ||
| 28586 | ||
| 22605 | 7.9% | |
| 18479 | 6.5% | |
| 18395 | 6.4% | |
| 14395 | 5.0% | |
| 14267 | 5.0% | |
| 14245 | 5.0% | |
| 10250 | 3.6% | |
| Other values (22) | 64342 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 225838 | |
| Space Separator | 49342 | 17.3% |
| Other Punctuation | 6107 | 2.1% |
| Uppercase Letter | 4125 | 1.4% |
| Decimal Number | 74 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 30580 | ||
| 28586 | ||
| 22605 | ||
| 18479 | ||
| 18395 | ||
| 14395 | 6.4% | |
| 14267 | 6.3% | |
| 14245 | 6.3% | |
| 10250 | 4.5% | |
| 10208 | 4.5% | |
| Other values (12) | 43828 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2038 | ||
| 2016 | ||
| 37 | 0.9% | |
| 17 | 0.4% | |
| 17 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| 4091 | ||
| 2016 |
Decimal Number
| Value | Count | Frequency (%) |
| 37 | ||
| 37 |
Space Separator
| Value | Count | Frequency (%) |
| 49342 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 229963 | |
| Common | 55523 | 19.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 30580 | ||
| 28586 | ||
| 22605 | ||
| 18479 | 8.0% | |
| 18395 | 8.0% | |
| 14395 | 6.3% | |
| 14267 | 6.2% | |
| 14245 | 6.2% | |
| 10250 | 4.5% | |
| 10208 | 4.4% | |
| Other values (17) | 47953 |
Common
| Value | Count | Frequency (%) |
| 49342 | ||
| 4091 | 7.4% | |
| 2016 | 3.6% | |
| 37 | 0.1% | |
| 37 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 285486 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 49342 | ||
| 30580 | ||
| 28586 | ||
| 22605 | 7.9% | |
| 18479 | 6.5% | |
| 18395 | 6.4% | |
| 14395 | 5.0% | |
| 14267 | 5.0% | |
| 14245 | 5.0% | |
| 10250 | 3.6% | |
| Other values (22) | 64342 |
situationalJudgement_5
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26382 |
| Missing (%) | 92.6% |
| Memory size | 1.2 MiB |
Length
| Max length | 153 |
|---|---|
| Median length | 153 |
| Mean length | 139.75284 |
| Min length | 18 |
Characters and Unicode
| Total characters | 295158 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|  | 1676 | 5.9% |
|  | 366 | 1.3% |
|  | 60 | 0.2% |
|  | 10 | < 0.1% |
| (Missing) | 26382 | 92.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 5780 | 10.3% | |
| 3788 | 6.8% | |
| 3718 | 6.6% | |
| 3352 | 6.0% | |
| 2408 | 4.3% | |
| 2052 | 3.7% | |
| 2042 | 3.6% | |
| 2042 | 3.6% | |
| 2042 | 3.6% | |
| 2042 | 3.6% | |
| Other values (36) | 26722 |
Most occurring characters
| Value | Count | Frequency (%) |
| 54618 | ||
| 33844 | ||
| 30340 | ||
| 25392 | 8.6% | |
| 17868 | 6.1% | |
| 15002 | 5.1% | |
| 13682 | 4.6% | |
| 11926 | 4.0% | |
| 11550 | 3.9% | |
| 10330 | 3.5% | |
| Other values (20) | 70606 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 226096 | |
| Space Separator | 54618 | 18.5% |
| Other Punctuation | 9924 | 3.4% |
| Uppercase Letter | 4520 | 1.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 33844 | ||
| 30340 | ||
| 25392 | ||
| 17868 | 7.9% | |
| 15002 | 6.6% | |
| 13682 | 6.1% | |
| 11926 | 5.3% | |
| 11550 | 5.1% | |
| 10330 | 4.6% | |
| 9884 | 4.4% | |
| Other values (12) | 46278 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3718 | ||
| 2408 | ||
| 2112 | ||
| 1686 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2418 | ||
| 2042 | ||
| 60 | 1.3% |
Space Separator
| Value | Count | Frequency (%) |
| 54618 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 230616 | |
| Common | 64542 | 21.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 33844 | ||
| 30340 | ||
| 25392 | ||
| 17868 | 7.7% | |
| 15002 | 6.5% | |
| 13682 | 5.9% | |
| 11926 | 5.2% | |
| 11550 | 5.0% | |
| 10330 | 4.5% | |
| 9884 | 4.3% | |
| Other values (15) | 50798 |
Common
| Value | Count | Frequency (%) |
| 54618 | ||
| 3718 | 5.8% | |
| 2408 | 3.7% | |
| 2112 | 3.3% | |
| 1686 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 295158 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 54618 | ||
| 33844 | ||
| 30340 | ||
| 25392 | 8.6% | |
| 17868 | 6.1% | |
| 15002 | 5.1% | |
| 13682 | 4.6% | |
| 11926 | 4.0% | |
| 11550 | 3.9% | |
| 10330 | 3.5% | |
| Other values (20) | 70606 |
situationalJudgement_6
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26395 |
| Missing (%) | 92.6% |
| Memory size | 1.2 MiB |
Length
| Max length | 201 |
|---|---|
| Median length | 201 |
| Mean length | 162.32206 |
| Min length | 7 |
Characters and Unicode
| Total characters | 340714 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Common Values
| Value | Count | Frequency (%) |
|  | 1154 | 4.0% |
|  | 705 | 2.5% |
|  | 227 | 0.8% |
|  | 12 | < 0.1% |
|  | 1 | < 0.1% |
| (Missing) | 26395 | 92.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 10936 | ||
| 3945 | 7.2% | |
| 3731 | 6.8% | |
| 3026 | 5.5% | |
| 2308 | 4.2% | |
| 1859 | 3.4% | |
| 1859 | 3.4% | |
| 1410 | 2.6% | |
| 1167 | 2.1% | |
| 1154 | 2.1% | |
| Other values (30) | 23700 |
Most occurring characters
| Value | Count | Frequency (%) |
| 53009 | ||
| 46440 | ||
| 28861 | 8.5% | |
| 20895 | 6.1% | |
| 20219 | 5.9% | |
| 19078 | 5.6% | |
| 18848 | 5.5% | |
| 18755 | 5.5% | |
| 16514 | 4.8% | |
| 14906 | 4.4% | |
| Other values (21) | 83189 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 279032 | |
| Space Separator | 53009 | 15.6% |
| Other Punctuation | 4436 | 1.3% |
| Uppercase Letter | 4237 | 1.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 46440 | ||
| 28861 | ||
| 20895 | 7.5% | |
| 20219 | 7.2% | |
| 19078 | 6.8% | |
| 18848 | 6.8% | |
| 18755 | 6.7% | |
| 16514 | 5.9% | |
| 14906 | 5.3% | |
| 13501 | 4.8% | |
| Other values (13) | 61015 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2603 | ||
| 1167 | ||
| 227 | 5.4% | |
| 227 | 5.4% | |
| 13 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| 3731 | ||
| 705 | 15.9% |
Space Separator
| Value | Count | Frequency (%) |
| 53009 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 283269 | |
| Common | 57445 | 16.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 46440 | ||
| 28861 | ||
| 20895 | 7.4% | |
| 20219 | 7.1% | |
| 19078 | 6.7% | |
| 18848 | 6.7% | |
| 18755 | 6.6% | |
| 16514 | 5.8% | |
| 14906 | 5.3% | |
| 13501 | 4.8% | |
| Other values (18) | 65252 |
Common
| Value | Count | Frequency (%) |
| 53009 | ||
| 3731 | 6.5% | |
| 705 | 1.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 340714 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 53009 | ||
| 46440 | ||
| 28861 | 8.5% | |
| 20895 | 6.1% | |
| 20219 | 5.9% | |
| 19078 | 5.6% | |
| 18848 | 5.5% | |
| 18755 | 5.5% | |
| 16514 | 4.8% | |
| 14906 | 4.4% | |
| Other values (21) | 83189 |
situationalJudgement_7
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26382 |
| Missing (%) | 92.6% |
| Memory size | 1.5 MiB |
Length
| Max length | 280 |
|---|---|
| Median length | 280 |
| Mean length | 264.79261 |
| Min length | 29 |
Characters and Unicode
| Total characters | 559242 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|  | 1914 | 6.7% |
|  | 157 | 0.6% |
|  | 21 | 0.1% |
|  | 20 | 0.1% |
| (Missing) | 26382 | 92.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 9988 | 10.6% | |
| 5742 | 6.1% | |
| 4027 | 4.3% | |
| 4006 | 4.3% | |
| 2249 | 2.4% | |
| 2092 | 2.2% | |
| 2092 | 2.2% | |
| 2071 | 2.2% | |
| 2071 | 2.2% | |
| 2071 | 2.2% | |
| Other values (55) | 57764 |
Most occurring characters
| Value | Count | Frequency (%) |
| 94132 | ||
| 64481 | ||
| 49911 | 8.9% | |
| 35026 | 6.3% | |
| 34391 | 6.1% | |
| 29683 | 5.3% | |
| 29619 | 5.3% | |
| 27800 | 5.0% | |
| 24272 | 4.3% | |
| 22142 | 4.0% | |
| Other values (22) | 147785 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 449245 | |
| Space Separator | 94132 | 16.8% |
| Other Punctuation | 8011 | 1.4% |
| Uppercase Letter | 7854 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 64481 | ||
| 49911 | ||
| 35026 | 7.8% | |
| 34391 | 7.7% | |
| 29683 | 6.6% | |
| 29619 | 6.6% | |
| 27800 | 6.2% | |
| 24272 | 5.4% | |
| 22142 | 4.9% | |
| 19831 | 4.4% | |
| Other values (14) | 112089 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3985 | ||
| 1914 | ||
| 1914 | ||
| 21 | 0.3% | |
| 20 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5940 | ||
| 2071 | 25.9% |
Space Separator
| Value | Count | Frequency (%) |
| 94132 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 457099 | |
| Common | 102143 | 18.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 64481 | ||
| 49911 | ||
| 35026 | 7.7% | |
| 34391 | 7.5% | |
| 29683 | 6.5% | |
| 29619 | 6.5% | |
| 27800 | 6.1% | |
| 24272 | 5.3% | |
| 22142 | 4.8% | |
| 19831 | 4.3% | |
| Other values (19) | 119943 |
Common
| Value | Count | Frequency (%) |
| 94132 | ||
| 5940 | 5.8% | |
| 2071 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 559242 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 94132 | ||
| 64481 | ||
| 49911 | 8.9% | |
| 35026 | 6.3% | |
| 34391 | 6.1% | |
| 29683 | 5.3% | |
| 29619 | 5.3% | |
| 27800 | 5.0% | |
| 24272 | 4.3% | |
| 22142 | 4.0% | |
| Other values (22) | 147785 |
situationalJudgement_8
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 26406 |
| Missing (%) | 92.7% |
| Memory size | 1.1 MiB |
Length
| Max length | 130 |
|---|---|
| Median length | 98 |
| Mean length | 76.713602 |
| Min length | 7 |
Characters and Unicode
| Total characters | 160178 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|  | 935 | 3.3% |
|  | 548 | 1.9% |
|  | 403 | 1.4% |
|  | 202 | 0.7% |
| (Missing) | 26406 | 92.7% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2878 | 9.4% | |
| 1943 | 6.3% | |
| 1886 | 6.1% | |
| 1741 | 5.7% | |
| 1556 | 5.1% | |
| 1338 | 4.3% | |
| 1338 | 4.3% | |
| 1338 | 4.3% | |
| 1338 | 4.3% | |
| 1338 | 4.3% | |
| Other values (26) | 14077 |
Most occurring characters
| Value | Count | Frequency (%) |
| 28683 | ||
| 17082 | 10.7% | |
| 16736 | 10.4% | |
| 9457 | 5.9% | |
| 8836 | 5.5% | |
| 7771 | 4.9% | |
| 7699 | 4.8% | |
| 6691 | 4.2% | |
| 6233 | 3.9% | |
| 4966 | 3.1% | |
| Other values (20) | 46024 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 124514 | |
| Space Separator | 28683 | 17.9% |
| Other Punctuation | 4345 | 2.7% |
| Uppercase Letter | 2636 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 17082 | ||
| 16736 | ||
| 9457 | 7.6% | |
| 8836 | 7.1% | |
| 7771 | 6.2% | |
| 7699 | 6.2% | |
| 6691 | 5.4% | |
| 6233 | 5.0% | |
| 4966 | 4.0% | |
| 4893 | 3.9% | |
| Other values (13) | 34150 |
Other Punctuation
| Value | Count | Frequency (%) |
| 1870 | ||
| 1540 | ||
| 935 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1483 | ||
| 605 | ||
| 548 | 20.8% |
Space Separator
| Value | Count | Frequency (%) |
| 28683 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 127150 | |
| Common | 33028 | 20.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 17082 | ||
| 16736 | ||
| 9457 | 7.4% | |
| 8836 | 6.9% | |
| 7771 | 6.1% | |
| 7699 | 6.1% | |
| 6691 | 5.3% | |
| 6233 | 4.9% | |
| 4966 | 3.9% | |
| 4893 | 3.8% | |
| Other values (16) | 36786 |
Common
| Value | Count | Frequency (%) |
| 28683 | ||
| 1870 | 5.7% | |
| 1540 | 4.7% | |
| 935 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 160178 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 28683 | ||
| 17082 | 10.7% | |
| 16736 | 10.4% | |
| 9457 | 5.9% | |
| 8836 | 5.5% | |
| 7771 | 4.9% | |
| 7699 | 4.8% | |
| 6691 | 4.2% | |
| 6233 | 3.9% | |
| 4966 | 3.1% | |
| Other values (20) | 46024 |
situationalJudgement_9
Categorical
IMBALANCE  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25820 |
| Missing (%) | 90.6% |
| Memory size | 1.6 MiB |
Length
| Max length | 249 |
|---|---|
| Median length | 249 |
| Mean length | 242.35453 |
| Min length | 37 |
Characters and Unicode
| Total characters | 648056 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
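The summary figures for this variable can be cross-checked against each other. The sketch below assumes, without confirmation from the report, that Distinct (%) is taken over non-missing rows and that the mean length is averaged over non-missing values. Both assumptions are consistent with the numbers shown for situationalJudgement_9.

```python
# Figures copied from the situationalJudgement_9 summary in this report.
n_rows = 28494            # observations in the dataset
missing = 25820           # missing cells for this variable
distinct = 4              # distinct non-missing values
mean_length = 242.35453   # mean string length
total_chars = 648056      # total characters
value_counts = [2521, 106, 26, 21]  # counts from the Common Values table

non_missing = n_rows - missing
missing_pct = 100 * missing / n_rows
# Assumption: the percentage is relative to non-missing rows.
distinct_pct = 100 * distinct / non_missing
# Assumption: mean length is averaged over non-missing values only.
implied_chars = mean_length * non_missing

print(f"non-missing rows: {non_missing}")
print(f"missing: {missing_pct:.1f}%  distinct: {distinct_pct:.1f}%")
print(f"implied total characters: {implied_chars:.0f}")
```

If the category counts did not sum to the non-missing row count, or the implied character total diverged from the reported one, that would point to a parsing problem in the column rather than a property of the data.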
Common Values
| Value | Count | Frequency (%) |
|  | 2521 | 8.8% |
|  | 106 | 0.4% |
|  | 26 | 0.1% |
|  | 21 | 0.1% |
| (Missing) | 25820 | 90.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 10343 | 9.9% | |
| 7563 | 7.2% | |
| 5195 | 5.0% | |
| 5068 | 4.9% | |
| 2759 | 2.6% | |
| 2653 | 2.5% | |
| 2653 | 2.5% | |
| 2627 | 2.5% | |
| 2627 | 2.5% | |
| 2627 | 2.5% | |
| Other values (39) | 60369 |
Most occurring characters
| Value | Count | Frequency (%) |
| 104331 | ||
| 62075 | 9.6% | |
| 54543 | 8.4% | |
| 44567 | 6.9% | |
| 44070 | 6.8% | |
| 36547 | 5.6% | |
| 36187 | 5.6% | |
| 31093 | 4.8% | |
| 28524 | 4.4% | |
| 28414 | 4.4% | |
| Other values (26) | 177705 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 522615 | |
| Space Separator | 104331 | 16.1% |
| Other Punctuation | 12970 | 2.0% |
| Uppercase Letter | 7822 | 1.2% |
| Decimal Number | 212 | < 0.1% |
| Currency Symbol | 106 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 62075 | ||
| 54543 | ||
| 44567 | 8.5% | |
| 44070 | 8.4% | |
| 36547 | 7.0% | |
| 36187 | 6.9% | |
| 31093 | 5.9% | |
| 28524 | 5.5% | |
| 28414 | 5.4% | |
| 23414 | 4.5% | |
| Other values (14) | 133181 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2547 | ||
| 2521 | ||
| 2521 | ||
| 127 | 1.6% | |
| 106 | 1.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| 7822 | ||
| 2627 | 20.3% | |
| 2521 | 19.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 106 | ||
| 106 |
Space Separator
| Value | Count | Frequency (%) |
| 104331 |
Currency Symbol
| Value | Count | Frequency (%) |
| 106 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 530437 | |
| Common | 117619 | 18.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 62075 | ||
| 54543 | 10.3% | |
| 44567 | 8.4% | |
| 44070 | 8.3% | |
| 36547 | 6.9% | |
| 36187 | 6.8% | |
| 31093 | 5.9% | |
| 28524 | 5.4% | |
| 28414 | 5.4% | |
| 23414 | 4.4% | |
| Other values (19) | 141003 |
Common
| Value | Count | Frequency (%) |
| 104331 | ||
| 7822 | 6.7% | |
| 2627 | 2.2% | |
| 2521 | 2.1% | |
| 106 | 0.1% | |
| 106 | 0.1% | |
| 106 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 648056 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 104331 | ||
| 62075 | 9.6% | |
| 54543 | 8.4% | |
| 44567 | 6.9% | |
| 44070 | 6.8% | |
| 36547 | 5.6% | |
| 36187 | 5.6% | |
| 31093 | 4.8% | |
| 28524 | 4.4% | |
| 28414 | 4.4% | |
| Other values (26) | 177705 |
total
Real number (ℝ)
| Distinct | 16 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 11229 |
| Missing (%) | 39.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.286476 |
| Minimum | 0 |
| Maximum | 33 |
| Zeros | 3 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 222.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 11 |
| median | 13 |
| Q3 | 15 |
| 95-th percentile | 33 |
| Maximum | 33 |
| Range | 33 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 7.0989102 |
|---|---|
| Coefficient of variation (CV) | 0.46439156 |
| Kurtosis | 2.1940269 |
| Mean | 15.286476 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.9351038 |
| Sum | 263921 |
| Variance | 50.394526 |
| Monotonicity | Not monotonic |
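Several of the statistics above are derived from one another, so they can be recomputed as a sanity check. The sketch below uses only figures reported for `total`. It assumes the mean is taken over non-missing rows, which the reported sum and row counts support.

```python
# Figures copied from the `total` summary in this report.
n_rows = 28494
missing = 11229
total_sum = 263921
mean = 15.286476
std = 7.0989102
q1, q3 = 11, 15
minimum, maximum = 0, 33

count = n_rows - missing  # non-missing observations

derived = {
    "mean": total_sum / count,             # Sum / non-missing count
    "variance": std ** 2,                  # square of the standard deviation
    "cv": std / mean,                      # coefficient of variation
    "iqr": q3 - q1,                        # interquartile range
    "range": maximum - minimum,
    "missing_pct": 100 * missing / n_rows,
}

for name, value in derived.items():
    print(f"{name}: {value:.6g}")
```

The recomputed values agree with the reported Mean, Variance, Coefficient of variation, Interquartile range, Range, and Missing (%) to the precision shown.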
Common Values
| Value | Count | Frequency (%) |
| 11 | 3778 | 13.3% |
| 14 | 3432 | 12.0% |
| 15 | 2740 | 9.6% |
| 33 | 2276 | 8.0% |
| 12 | 2117 | 7.4% |
| 10 | 1552 | 5.4% |
| 13 | 1262 | 4.4% |
| 9 | 47 | 0.2% |
| 8 | 17 | 0.1% |
| 7 | 17 | 0.1% |
| Other values (6) | 27 | 0.1% |
| (Missing) | 11229 | 39.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 3 | < 0.1% |
| 1 | 6 | < 0.1% |
| 2 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 17 | 0.1% |
| 8 | 17 | 0.1% |
| 9 | 47 | 0.2% |
| 10 | 1552 | 5.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 33 | 2276 | 8.0% |
| 15 | 2740 | 9.6% |
| 14 | 3432 | 12.0% |
| 13 | 1262 | 4.4% |
| 12 | 2117 | 7.4% |
| 11 | 3778 | 13.3% |
| 10 | 1552 | 5.4% |
| 9 | 47 | 0.2% |
| 8 | 17 | 0.1% |
| 7 | 17 | 0.1% |
trial
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 2810 |
|---|---|
| Distinct (%) | 91.3% |
| Missing | 25415 |
| Missing (%) | 89.2% |
| Memory size | 1.3 MiB |
Length
| Max length | 427 |
|---|---|
| Median length | 382 |
| Mean length | 137.1062 |
| Min length | 51 |
Characters and Unicode
| Total characters | 422150 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2636 |
|---|---|
| Unique (%) | 85.6% |
Common Values
| Value | Count | Frequency (%) |
|  | 11 | < 0.1% |
|  | 10 | < 0.1% |
|  | 7 | < 0.1% |
|  | 6 | < 0.1% |
|  | 6 | < 0.1% |
|  | 6 | < 0.1% |
|  | 6 | < 0.1% |
|  | 5 | < 0.1% |
|  | 5 | < 0.1% |
|  | 5 | < 0.1% |
| Other values (2800) | 3012 | 10.6% |
| (Missing) | 25415 | 89.2% |
Length
| Value | Count | Frequency (%) |
| 11 | 0.4% | |
| 10 | 0.3% | |
| 7 | 0.2% | |
| 6 | 0.2% | |
| 6 | 0.2% | |
| 6 | 0.2% | |
| 6 | 0.2% | |
| 5 | 0.2% | |
| 5 | 0.2% | |
| 5 | 0.2% | |
| Other values (2800) | 3012 |
Most occurring characters
| Value | Count | Frequency (%) |
| 93828 | ||
| 31276 | 7.4% | |
| 29951 | 7.1% | |
| 25019 | 5.9% | |
| 23457 | 5.6% | |
| 23457 | 5.6% | |
| 20378 | 4.8% | |
| 15638 | 3.7% | |
| 15638 | 3.7% | |
| 15638 | 3.7% | |
| Other values (23) | 127870 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 181560 | |
| Other Punctuation | 153301 | |
| Decimal Number | 65493 | 15.5% |
| Open Punctuation | 10898 | 2.6% |
| Close Punctuation | 10898 | 2.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 31276 | ||
| 29951 | ||
| 23457 | ||
| 15638 | ||
| 15638 | ||
| 12590 | ||
| 12590 | ||
| 7819 | 4.3% | |
| 7819 | 4.3% | |
| 7819 | 4.3% | |
| Other values (5) | 16963 |
Decimal Number
| Value | Count | Frequency (%) |
| 25019 | ||
| 6322 | 9.7% | |
| 6105 | 9.3% | |
| 5707 | 8.7% | |
| 4729 | 7.2% | |
| 4662 | 7.1% | |
| 4396 | 6.7% | |
| 3824 | 5.8% | |
| 2466 | 3.8% | |
| 2263 | 3.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| 93828 | ||
| 23457 | 15.3% | |
| 20378 | 13.3% | |
| 15638 | 10.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| 7819 | ||
| 3079 | 28.3% |
Close Punctuation
| Value | Count | Frequency (%) |
| 7819 | ||
| 3079 | 28.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 240590 | |
| Latin | 181560 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 93828 | ||
| 25019 | 10.4% | |
| 23457 | 9.7% | |
| 20378 | 8.5% | |
| 15638 | 6.5% | |
| 7819 | 3.2% | |
| 7819 | 3.2% | |
| 6322 | 2.6% | |
| 6105 | 2.5% | |
| 5707 | 2.4% | |
| Other values (8) | 28498 | 11.8% |
Latin
| Value | Count | Frequency (%) |
| 31276 | ||
| 29951 | ||
| 23457 | ||
| 15638 | ||
| 15638 | ||
| 12590 | ||
| 12590 | ||
| 7819 | 4.3% | |
| 7819 | 4.3% | |
| 7819 | 4.3% | |
| Other values (5) | 16963 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 422150 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 93828 | ||
| 31276 | 7.4% | |
| 29951 | 7.1% | |
| 25019 | 5.9% | |
| 23457 | 5.6% | |
| 23457 | 5.6% | |
| 20378 | 4.8% | |
| 15638 | 3.7% | |
| 15638 | 3.7% | |
| 15638 | 3.7% | |
| Other values (23) | 127870 |
type
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 17 |
| Mean length | 12.326525 |
| Min length | 5 |
Characters and Unicode
| Total characters | 351232 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 3797 | ||
| 3750 | ||
| 3535 | ||
| 3282 | ||
| 3079 | ||
| 2741 | ||
| 2276 | ||
| 2140 | ||
| 1812 | ||
| 1168 | 4.1% | |
| Other values (2) | 914 | 3.2% |
Length
| Value | Count | Frequency (%) |
| 3797 | ||
| 3750 | ||
| 3535 | ||
| 3282 | ||
| 3079 | ||
| 2741 | ||
| 2276 | ||
| 2140 | ||
| 1812 | ||
| 1168 | 4.1% | |
| Other values (2) | 914 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 39570 | ||
| 37958 | ||
| 36463 | ||
| 30219 | 8.6% | |
| 21969 | 6.3% | |
| 21714 | 6.2% | |
| 21126 | 6.0% | |
| 20722 | 5.9% | |
| 19178 | 5.5% | |
| 15255 | 4.3% | |
| Other values (26) | 87058 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 335755 | |
| Uppercase Letter | 11478 | 3.3% |
| Dash Punctuation | 3991 | 1.1% |
| Decimal Number | 6 | < 0.1% |
| Connector Punctuation | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 39570 | ||
| 37958 | ||
| 36463 | ||
| 30219 | ||
| 21969 | 6.5% | |
| 21714 | 6.5% | |
| 21126 | 6.3% | |
| 20722 | 6.2% | |
| 19178 | 5.7% | |
| 15255 | 4.5% | |
| Other values (12) | 71581 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 4918 | ||
| 3797 | ||
| 2743 | ||
| 4 | < 0.1% | |
| 4 | < 0.1% | |
| 4 | < 0.1% | |
| 4 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | ||
| 2 | ||
| 2 |
Dash Punctuation
| Value | Count | Frequency (%) |
| 3991 |
Connector Punctuation
| Value | Count | Frequency (%) |
| 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 347233 | |
| Common | 3999 | 1.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 39570 | ||
| 37958 | ||
| 36463 | ||
| 30219 | 8.7% | |
| 21969 | 6.3% | |
| 21714 | 6.3% | |
| 21126 | 6.1% | |
| 20722 | 6.0% | |
| 19178 | 5.5% | |
| 15255 | 4.4% | |
| Other values (21) | 83059 |
Common
| Value | Count | Frequency (%) |
| 3991 | ||
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 351232 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 39570 | ||
| 37958 | ||
| 36463 | ||
| 30219 | 8.6% | |
| 21969 | 6.3% | |
| 21714 | 6.2% | |
| 21126 | 6.0% | |
| 20722 | 5.9% | |
| 19178 | 5.5% | |
| 15255 | 4.3% | |
| Other values (26) | 87058 |
user
Categorical
HIGH CARDINALITY  UNIFORM 
| Distinct | 3835 |
|---|---|
| Distinct (%) | 13.5% |
| Missing | 17 |
| Missing (%) | 0.1% |
| Memory size | 2.0 MiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Characters and Unicode
| Total characters | 484109 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 68 |
|---|---|
| Unique (%) | 0.2% |
Common Values
| Value | Count | Frequency (%) |
| 19 | 0.1% | |
| 17 | 0.1% | |
| 16 | 0.1% | |
| 14 | < 0.1% | |
| 13 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| Other values (3825) | 28343 | |
| (Missing) | 17 | 0.1% |
Length
| Value | Count | Frequency (%) |
| 19 | 0.1% | |
| 17 | 0.1% | |
| 16 | 0.1% | |
| 14 | < 0.1% | |
| 13 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| 11 | < 0.1% | |
| Other values (3825) | 28343 |
Most occurring characters
| Value | Count | Frequency (%) |
| 9339 | 1.9% | |
| 9222 | 1.9% | |
| 9219 | 1.9% | |
| 9207 | 1.9% | |
| 9199 | 1.9% | |
| 9182 | 1.9% | |
| 9147 | 1.9% | |
| 9076 | 1.9% | |
| 9051 | 1.9% | |
| 9041 | 1.9% | |
| Other values (45) | 392426 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 220526 | |
| Uppercase Letter | 193122 | |
| Decimal Number | 70461 | 14.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 9339 | 4.2% | |
| 9222 | 4.2% | |
| 9199 | 4.2% | |
| 9182 | 4.2% | |
| 9147 | 4.1% | |
| 9041 | 4.1% | |
| 9022 | 4.1% | |
| 8993 | 4.1% | |
| 8981 | 4.1% | |
| 8880 | 4.0% | |
| Other values (15) | 129520 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 9207 | 4.8% | |
| 9076 | 4.7% | |
| 9051 | 4.7% | |
| 8988 | 4.7% | |
| 8899 | 4.6% | |
| 8873 | 4.6% | |
| 8852 | 4.6% | |
| 8839 | 4.6% | |
| 8790 | 4.6% | |
| 8762 | 4.5% | |
| Other values (12) | 103785 |
Decimal Number
| Value | Count | Frequency (%) |
| 9219 | ||
| 9001 | ||
| 8849 | ||
| 8848 | ||
| 8765 | ||
| 8710 | ||
| 8699 | ||
| 8370 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 413648 | |
| Common | 70461 | 14.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 9339 | 2.3% | |
| 9222 | 2.2% | |
| 9207 | 2.2% | |
| 9199 | 2.2% | |
| 9182 | 2.2% | |
| 9147 | 2.2% | |
| 9076 | 2.2% | |
| 9051 | 2.2% | |
| 9041 | 2.2% | |
| 9022 | 2.2% | |
| Other values (37) | 322162 |
Common
| Value | Count | Frequency (%) |
| 9219 | ||
| 9001 | ||
| 8849 | ||
| 8848 | ||
| 8765 | ||
| 8710 | ||
| 8699 | ||
| 8370 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 484109 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9339 | 1.9% | |
| 9222 | 1.9% | |
| 9219 | 1.9% | |
| 9207 | 1.9% | |
| 9199 | 1.9% | |
| 9182 | 1.9% | |
| 9147 | 1.9% | |
| 9076 | 1.9% | |
| 9051 | 1.9% | |
| 9041 | 1.9% | |
| Other values (45) | 392426 |
voice_1.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3121 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25369 |
| Missing (%) | 89.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
voice_1.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3139 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 78600 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3134 |
|---|---|
| Unique (%) | 99.7% |
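The overview percentages for voice_1.fileName can be reproduced from the counts listed in this report. The sketch below is a consistency check only. It assumes (an inference from the figures, not something the report states) that Missing (%) is taken over all 28494 observations, while Distinct (%) and Unique (%) are taken over the non-missing rows.

```python
# Counts copied from the dataset statistics and the voice_1.fileName overview.
n_obs = 28494        # Number of observations
n_missing = 25350    # Missing
n_distinct = 3139    # Distinct
n_unique = 3134      # Unique (values that occur exactly once)
name_length = 25     # Min, median, mean and max length are all 25

# Rows that actually carry a file name.
n_present = n_obs - n_missing

# Assumption: missing share uses all rows, the other two use present rows only.
missing_pct = round(100 * n_missing / n_obs, 1)
distinct_pct = round(100 * n_distinct / n_present, 1)
unique_pct = round(100 * n_unique / n_present, 1)

# Every name has the same length, so total characters = rows * length.
total_chars = n_present * name_length

# Distinct values that are not unique must account for the remaining rows.
repeated_values = n_distinct - n_unique
rows_in_repeats = n_present - n_unique

print(n_present, missing_pct, distinct_pct, unique_pct, total_chars)
print(repeated_values, rows_in_repeats)
```

Under these assumptions the results agree with the report: 3144 present rows, 89.0% missing, 99.8% distinct, 99.7% unique and 78600 total characters. The five repeated values cover ten rows, which matches the five entries with a count of 2 in the Common Values table.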
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 | 11.0% |
| (Missing) | 25350 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6288 | 8.0% | |
| 4153 | 5.3% | |
| 4134 | 5.3% | |
| 4107 | 5.2% | |
| 4069 | 5.2% | |
| 4064 | 5.2% | |
| 2646 | 3.4% | |
| 2221 | 2.8% | |
| 1225 | 1.6% | |
| 1065 | 1.4% | |
| Other values (48) | 44628 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 40010 | |
| Uppercase Letter | 21446 | |
| Decimal Number | 10856 | 13.8% |
| Connector Punctuation | 6288 | 8.0% |
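The category names in the table above are standard Unicode general categories. As a minimal sketch of how such a breakdown could be computed, the snippet below uses Python's `unicodedata` module on a made-up value. The sample string, the `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative assumptions; the real file names are not shown in this report, and the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes mapped to the labels used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Hypothetical value, not taken from the dataset.
sample = ["abc_DEF_123"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

The underscore is the usual ASCII character in the Connector Punctuation category, so the 6288 connector characters across 3144 non-missing names would be consistent with two underscores per file name, although the report itself does not show the characters.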
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4153 | 10.4% | |
| 4134 | 10.3% | |
| 4107 | 10.3% | |
| 4069 | 10.2% | |
| 4064 | 10.2% | |
| 1017 | 2.5% | |
| 1016 | 2.5% | |
| 1009 | 2.5% | |
| 1003 | 2.5% | |
| 998 | 2.5% | |
| Other values (15) | 14440 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1065 | 5.0% | |
| 1033 | 4.8% | |
| 994 | 4.6% | |
| 993 | 4.6% | |
| 992 | 4.6% | |
| 990 | 4.6% | |
| 984 | 4.6% | |
| 983 | 4.6% | |
| 983 | 4.6% | |
| 980 | 4.6% | |
| Other values (12) | 11449 |
Decimal Number
| Value | Count | Frequency (%) |
| 2646 | ||
| 2221 | ||
| 1225 | ||
| 963 | 8.9% | |
| 956 | 8.8% | |
| 946 | 8.7% | |
| 945 | 8.7% | |
| 913 | 8.4% | |
| 37 | 0.3% | |
| 4 | < 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6288 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 61456 | |
| Common | 17144 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4153 | 6.8% | |
| 4134 | 6.7% | |
| 4107 | 6.7% | |
| 4069 | 6.6% | |
| 4064 | 6.6% | |
| 1065 | 1.7% | |
| 1033 | 1.7% | |
| 1017 | 1.7% | |
| 1016 | 1.7% | |
| 1009 | 1.6% | |
| Other values (37) | 35789 |
Common
| Value | Count | Frequency (%) |
| 6288 | ||
| 2646 | ||
| 2221 | 13.0% | |
| 1225 | 7.1% | |
| 963 | 5.6% | |
| 956 | 5.6% | |
| 946 | 5.5% | |
| 945 | 5.5% | |
| 913 | 5.3% | |
| 37 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 78600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6288 | 8.0% | |
| 4153 | 5.3% | |
| 4134 | 5.3% | |
| 4107 | 5.2% | |
| 4069 | 5.2% | |
| 4064 | 5.2% | |
| 2646 | 3.4% | |
| 2221 | 2.8% | |
| 1225 | 1.6% | |
| 1065 | 1.4% | |
| Other values (48) | 44628 |
voice_1.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.9 MiB |
Length
| Max length | 322 |
|---|---|
| Median length | 305 |
| Mean length | 308.32824 |
| Min length | 57 |
Characters and Unicode
| Total characters | 969384 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1884 | 6.6% | |
| 1215 | 4.3% | |
| 35 | 0.1% | |
| 5 | < 0.1% | |
| 3 | < 0.1% | |
| 2 | < 0.1% | |
| (Missing) | 25350 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 14375 | 8.1% | |
| 13632 | 7.7% | |
| 10517 | 5.9% | |
| 6203 | 3.5% | |
| 3778 | 2.1% | |
| 3773 | 2.1% | |
| 3771 | 2.1% | |
| 3768 | 2.1% | |
| 3680 | 2.1% | |
| 3645 | 2.0% | |
| Other values (109) | 110730 |
Most occurring characters
| Value | Count | Frequency (%) |
| 174728 | ||
| 81105 | 8.4% | |
| 78971 | 8.1% | |
| 74848 | 7.7% | |
| 59400 | 6.1% | |
| 52740 | 5.4% | |
| 43514 | 4.5% | |
| 43510 | 4.5% | |
| 38033 | 3.9% | |
| 37288 | 3.8% | |
| Other values (31) | 285247 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 758358 | |
| Space Separator | 174728 | 18.0% |
| Other Punctuation | 17530 | 1.8% |
| Uppercase Letter | 13116 | 1.4% |
| Decimal Number | 5652 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 81105 | 10.7% | |
| 78971 | 10.4% | |
| 74848 | 9.9% | |
| 59400 | 7.8% | |
| 52740 | 7.0% | |
| 43514 | 5.7% | |
| 43510 | 5.7% | |
| 38033 | 5.0% | |
| 37288 | 4.9% | |
| 34958 | 4.6% | |
| Other values (14) | 213991 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 5018 | ||
| 3702 | ||
| 1886 | 14.4% | |
| 1215 | 9.3% | |
| 1215 | 9.3% | |
| 73 | 0.6% | |
| 5 | < 0.1% | |
| 2 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 10602 | ||
| 3780 | 21.6% | |
| 1884 | 10.7% | |
| 1224 | 7.0% | |
| 40 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 1884 | ||
| 1884 | ||
| 1884 |
Space Separator
| Value | Count | Frequency (%) |
| 174728 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 771474 | |
| Common | 197910 | 20.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 81105 | 10.5% | |
| 78971 | 10.2% | |
| 74848 | 9.7% | |
| 59400 | 7.7% | |
| 52740 | 6.8% | |
| 43514 | 5.6% | |
| 43510 | 5.6% | |
| 38033 | 4.9% | |
| 37288 | 4.8% | |
| 34958 | 4.5% | |
| Other values (22) | 227107 |
Common
| Value | Count | Frequency (%) |
| 174728 | ||
| 10602 | 5.4% | |
| 3780 | 1.9% | |
| 1884 | 1.0% | |
| 1884 | 1.0% | |
| 1884 | 1.0% | |
| 1884 | 1.0% | |
| 1224 | 0.6% | |
| 40 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 969384 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 174728 | ||
| 81105 | 8.4% | |
| 78971 | 8.1% | |
| 74848 | 7.7% | |
| 59400 | 6.1% | |
| 52740 | 5.4% | |
| 43514 | 4.5% | |
| 43510 | 4.5% | |
| 38033 | 3.9% | |
| 37288 | 3.8% | |
| Other values (31) | 285247 |
| Distinct | 3253 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25237 |
| Missing (%) | 88.6% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3243) | 3243 | 11.4% |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3243) | 3243 | 11.4% |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
voice_10.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3277 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.0 MiB |
Length
| Max length | 26 |
|---|---|
| Median length | 26 |
| Mean length | 25.884522 |
| Min length | 25 |
Characters and Unicode
| Total characters | 84953 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3272 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3267) | 3267 | 11.5% |
| (Missing) | 25212 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3267) | 3267 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6564 | 7.7% | |
| 4347 | 5.1% | |
| 4313 | 5.1% | |
| 4283 | 5.0% | |
| 4246 | 5.0% | |
| 4244 | 5.0% | |
| 2903 | 3.4% | |
| 2903 | 3.4% | |
| 1307 | 1.5% | |
| 1110 | 1.3% | |
| Other values (48) | 48733 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 41758 | |
| Uppercase Letter | 22399 | |
| Decimal Number | 14232 | 16.8% |
| Connector Punctuation | 6564 | 7.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4347 | 10.4% | |
| 4313 | 10.3% | |
| 4283 | 10.3% | |
| 4246 | 10.2% | |
| 4244 | 10.2% | |
| 1081 | 2.6% | |
| 1058 | 2.5% | |
| 1054 | 2.5% | |
| 1054 | 2.5% | |
| 1038 | 2.5% | |
| Other values (15) | 15040 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1110 | 5.0% | |
| 1079 | 4.8% | |
| 1040 | 4.6% | |
| 1038 | 4.6% | |
| 1032 | 4.6% | |
| 1032 | 4.6% | |
| 1029 | 4.6% | |
| 1027 | 4.6% | |
| 1025 | 4.6% | |
| 1023 | 4.6% | |
| Other values (12) | 11964 |
Decimal Number
| Value | Count | Frequency (%) |
| 2903 | ||
| 2903 | ||
| 1307 | ||
| 1059 | 7.4% | |
| 1048 | 7.4% | |
| 1014 | 7.1% | |
| 1007 | 7.1% | |
| 1002 | 7.0% | |
| 1000 | 7.0% | |
| 989 | 6.9% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6564 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 64157 | |
| Common | 20796 | 24.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4347 | 6.8% | |
| 4313 | 6.7% | |
| 4283 | 6.7% | |
| 4246 | 6.6% | |
| 4244 | 6.6% | |
| 1110 | 1.7% | |
| 1081 | 1.7% | |
| 1079 | 1.7% | |
| 1058 | 1.6% | |
| 1054 | 1.6% | |
| Other values (37) | 37342 |
Common
| Value | Count | Frequency (%) |
| 6564 | ||
| 2903 | ||
| 2903 | ||
| 1307 | 6.3% | |
| 1059 | 5.1% | |
| 1048 | 5.0% | |
| 1014 | 4.9% | |
| 1007 | 4.8% | |
| 1002 | 4.8% | |
| 1000 | 4.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 84953 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6564 | 7.7% | |
| 4347 | 5.1% | |
| 4313 | 5.1% | |
| 4283 | 5.0% | |
| 4246 | 5.0% | |
| 4244 | 5.0% | |
| 2903 | 3.4% | |
| 2903 | 3.4% | |
| 1307 | 1.5% | |
| 1110 | 1.3% | |
| Other values (48) | 48733 |
voice_10.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 34 |
| Mean length | 34.898537 |
| Min length | 34 |
Characters and Unicode
| Total characters | 114537 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 4 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 3244 | 11.4% | |
| 31 | 0.1% | |
| 7 | < 0.1% | |
| (Missing) | 25212 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3282 | ||
| 3275 | ||
| 3275 | ||
| 3244 | ||
| 3244 | ||
| 3244 | ||
| 52 | 0.3% | |
| 38 | 0.2% | |
| 35 | 0.2% | |
| 31 | 0.2% | |
| Other values (48) | 516 | 2.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 16954 | ||
| 10045 | 8.8% | |
| 10006 | 8.7% | |
| 9781 | 8.5% | |
| 6834 | 6.0% | |
| 6725 | 5.9% | |
| 6676 | 5.8% | |
| 6661 | 5.8% | |
| 6651 | 5.8% | |
| 6509 | 5.7% | |
| Other values (19) | 27695 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 90921 | |
| Space Separator | 16954 | 14.8% |
| Other Punctuation | 3338 | 2.9% |
| Uppercase Letter | 3324 | 2.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 10045 | ||
| 10006 | ||
| 9781 | ||
| 6834 | 7.5% | |
| 6725 | 7.4% | |
| 6676 | 7.3% | |
| 6661 | 7.3% | |
| 6651 | 7.3% | |
| 6509 | 7.2% | |
| 3456 | 3.8% | |
| Other values (12) | 17577 |
Other Punctuation
| Value | Count | Frequency (%) |
| 3282 | ||
| 28 | 0.8% | |
| 21 | 0.6% | |
| 7 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| 3275 | ||
| 49 | 1.5% |
Space Separator
| Value | Count | Frequency (%) |
| 16954 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 94245 | |
| Common | 20292 | 17.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 10045 | ||
| 10006 | ||
| 9781 | ||
| 6834 | 7.3% | |
| 6725 | 7.1% | |
| 6676 | 7.1% | |
| 6661 | 7.1% | |
| 6651 | 7.1% | |
| 6509 | 6.9% | |
| 3456 | 3.7% | |
| Other values (14) | 20901 |
Common
| Value | Count | Frequency (%) |
| 16954 | ||
| 3282 | 16.2% | |
| 28 | 0.1% | |
| 21 | 0.1% | |
| 7 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 114537 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 16954 | ||
| 10045 | 8.8% | |
| 10006 | 8.7% | |
| 9781 | 8.5% | |
| 6834 | 6.0% | |
| 6725 | 5.9% | |
| 6676 | 5.8% | |
| 6661 | 5.8% | |
| 6651 | 5.8% | |
| 6509 | 5.7% | |
| Other values (19) | 27695 |
voice_2.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 2919 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25572 |
| Missing (%) | 89.7% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (2909) | 2909 | 10.2% |
| (Missing) | 25572 |
| Value | Count | Frequency (%) |
| 2922 | 10.3% | |
| (Missing) | 25572 |
| Value | Count | Frequency (%) |
| 2922 | 10.3% | |
| (Missing) | 25572 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (2909) | 2909 | 10.2% |
| (Missing) | 25572 |
| Value | Count | Frequency (%) |
| 2922 | 10.3% | |
| (Missing) | 25572 |
| Value | Count | Frequency (%) |
| 2922 | 10.3% | |
| (Missing) | 25572 |
voice_2.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 2923 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25567 |
| Missing (%) | 89.7% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 73175 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2919 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (2913) | 2913 | 10.2% |
| (Missing) | 25567 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (2913) | 2913 |
Most occurring characters
| Value | Count | Frequency (%) |
| 5854 | 8.0% | |
| 3861 | 5.3% | |
| 3849 | 5.3% | |
| 3816 | 5.2% | |
| 3787 | 5.2% | |
| 3785 | 5.2% | |
| 2888 | 3.9% | |
| 998 | 1.4% | |
| 957 | 1.3% | |
| 948 | 1.3% | |
| Other values (48) | 42432 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 37218 | |
| Uppercase Letter | 19987 | |
| Decimal Number | 10116 | 13.8% |
| Connector Punctuation | 5854 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 3861 | 10.4% | |
| 3849 | 10.3% | |
| 3816 | 10.3% | |
| 3787 | 10.2% | |
| 3785 | 10.2% | |
| 947 | 2.5% | |
| 946 | 2.5% | |
| 933 | 2.5% | |
| 929 | 2.5% | |
| 927 | 2.5% | |
| Other values (15) | 13438 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 998 | 5.0% | |
| 957 | 4.8% | |
| 942 | 4.7% | |
| 931 | 4.7% | |
| 926 | 4.6% | |
| 926 | 4.6% | |
| 920 | 4.6% | |
| 919 | 4.6% | |
| 911 | 4.6% | |
| 907 | 4.5% | |
| Other values (12) | 10650 |
Decimal Number
| Value | Count | Frequency (%) |
| 2888 | ||
| 948 | 9.4% | |
| 937 | 9.3% | |
| 915 | 9.0% | |
| 896 | 8.9% | |
| 889 | 8.8% | |
| 881 | 8.7% | |
| 879 | 8.7% | |
| 844 | 8.3% | |
| 39 | 0.4% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 5854 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 57205 | |
| Common | 15970 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 3861 | 6.7% | |
| 3849 | 6.7% | |
| 3816 | 6.7% | |
| 3787 | 6.6% | |
| 3785 | 6.6% | |
| 998 | 1.7% | |
| 957 | 1.7% | |
| 947 | 1.7% | |
| 946 | 1.7% | |
| 942 | 1.6% | |
| Other values (37) | 33317 |
Common
| Value | Count | Frequency (%) |
| 5854 | ||
| 2888 | ||
| 948 | 5.9% | |
| 937 | 5.9% | |
| 915 | 5.7% | |
| 896 | 5.6% | |
| 889 | 5.6% | |
| 881 | 5.5% | |
| 879 | 5.5% | |
| 844 | 5.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 73175 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 5854 | 8.0% | |
| 3861 | 5.3% | |
| 3849 | 5.3% | |
| 3816 | 5.2% | |
| 3787 | 5.2% | |
| 3785 | 5.2% | |
| 2888 | 3.9% | |
| 998 | 1.4% | |
| 957 | 1.3% | |
| 948 | 1.3% | |
| Other values (48) | 42432 |
voice_2.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25567 |
| Missing (%) | 89.7% |
| Memory size | 1.1 MiB |
Length
| Max length | 77 |
|---|---|
| Median length | 57 |
| Mean length | 57.266484 |
| Min length | 57 |
Characters and Unicode
| Total characters | 167619 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 2888 | 10.1% | |
| 39 | 0.1% | |
| (Missing) | 25567 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 5893 | ||
| 2927 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| 2888 | ||
| Other values (10) | 390 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 29387 | ||
| 17523 | 10.5% | |
| 14674 | 8.8% | |
| 11591 | 6.9% | |
| 8937 | 5.3% | |
| 8859 | 5.3% | |
| 8781 | 5.2% | |
| 5932 | 3.5% | |
| 5932 | 3.5% | |
| 5854 | 3.5% | |
| Other values (20) | 50149 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 117665 | |
| Space Separator | 29387 | 17.5% |
| Uppercase Letter | 11708 | 7.0% |
| Other Punctuation | 8859 | 5.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 17523 | ||
| 14674 | ||
| 11591 | ||
| 8937 | 7.6% | |
| 8859 | 7.5% | |
| 8781 | 7.5% | |
| 5932 | 5.0% | |
| 5932 | 5.0% | |
| 5854 | 5.0% | |
| 5776 | 4.9% | |
| Other values (10) | 23806 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 5776 | ||
| 2966 | ||
| 2888 | ||
| 39 | 0.3% | |
| 39 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5815 | ||
| 2927 | ||
| 78 | 0.9% | |
| 39 | 0.4% |
Space Separator
| Value | Count | Frequency (%) |
| 29387 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 129373 | |
| Common | 38246 | 22.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 17523 | ||
| 14674 | ||
| 11591 | 9.0% | |
| 8937 | 6.9% | |
| 8859 | 6.8% | |
| 8781 | 6.8% | |
| 5932 | 4.6% | |
| 5932 | 4.6% | |
| 5854 | 4.5% | |
| 5776 | 4.5% | |
| Other values (15) | 35514 |
Common
| Value | Count | Frequency (%) |
| 29387 | ||
| 5815 | 15.2% | |
| 2927 | 7.7% | |
| 78 | 0.2% | |
| 39 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 167619 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 29387 | ||
| 17523 | 10.5% | |
| 14674 | 8.8% | |
| 11591 | 6.9% | |
| 8937 | 5.3% | |
| 8859 | 5.3% | |
| 8781 | 5.2% | |
| 5932 | 3.5% | |
| 5932 | 3.5% | |
| 5854 | 3.5% | |
| Other values (20) | 50149 |
voice_3.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3053 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25438 |
| Missing (%) | 89.3% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3043) | 3043 | 10.7% |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3043) | 3043 | 10.7% |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
voice_3.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3061 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 76625 |
|---|---|
| Distinct characters | 57 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3057 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 | 10.7% |
| (Missing) | 25429 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2589 | 3.4% | |
| 2171 | 2.8% | |
| 1153 | 1.5% | |
| 1043 | 1.4% | |
| Other values (47) | 43535 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 38966 | |
| Uppercase Letter | 20940 | |
| Decimal Number | 10589 | 13.8% |
| Connector Punctuation | 6130 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4055 | 10.4% | |
| 4028 | 10.3% | |
| 3992 | 10.2% | |
| 3965 | 10.2% | |
| 3964 | 10.2% | |
| 1011 | 2.6% | |
| 992 | 2.5% | |
| 975 | 2.5% | |
| 970 | 2.5% | |
| 968 | 2.5% | |
| Other values (15) | 14046 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1043 | 5.0% | |
| 1003 | 4.8% | |
| 981 | 4.7% | |
| 980 | 4.7% | |
| 976 | 4.7% | |
| 972 | 4.6% | |
| 968 | 4.6% | |
| 951 | 4.5% | |
| 950 | 4.5% | |
| 950 | 4.5% | |
| Other values (12) | 11166 |
Decimal Number
| Value | Count | Frequency (%) |
| 2589 | ||
| 2171 | ||
| 1153 | ||
| 991 | 9.4% | |
| 933 | 8.8% | |
| 931 | 8.8% | |
| 920 | 8.7% | |
| 900 | 8.5% | |
| 1 | < 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6130 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59906 | |
| Common | 16719 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4055 | 6.8% | |
| 4028 | 6.7% | |
| 3992 | 6.7% | |
| 3965 | 6.6% | |
| 3964 | 6.6% | |
| 1043 | 1.7% | |
| 1011 | 1.7% | |
| 1003 | 1.7% | |
| 992 | 1.7% | |
| 981 | 1.6% | |
| Other values (37) | 34872 |
Common
| Value | Count | Frequency (%) |
| 6130 | ||
| 2589 | ||
| 2171 | 13.0% | |
| 1153 | 6.9% | |
| 991 | 5.9% | |
| 933 | 5.6% | |
| 931 | 5.6% | |
| 920 | 5.5% | |
| 900 | 5.4% | |
| 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 76625 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2589 | 3.4% | |
| 2171 | 2.8% | |
| 1153 | 1.5% | |
| 1043 | 1.4% | |
| Other values (47) | 43535 |
voice_3.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 1.6 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 135 |
| Mean length | 211.09103 |
| Min length | 135 |
Characters and Unicode
| Total characters | 646994 |
|---|---|
| Distinct characters | 40 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Common Values
| Value | Count | Frequency (%) |
| 1672 | 5.9% | |
| 1351 | 4.7% | |
| 35 | 0.1% | |
| 3 | < 0.1% | |
| 3 | < 0.1% | |
| 1 | < 0.1% | |
| (Missing) | 25429 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 11862 | 9.9% | |
| 8883 | 7.4% | |
| 4734 | 4.0% | |
| 4374 | 3.7% | |
| 4123 | 3.5% | |
| 3345 | 2.8% | |
| 3058 | 2.6% | |
| 3023 | 2.5% | |
| 2773 | 2.3% | |
| 2750 | 2.3% | |
| Other values (146) | 70316 |
Most occurring characters
| Value | Count | Frequency (%) |
| 116176 | ||
| 54032 | 8.4% | |
| 51591 | 8.0% | |
| 48269 | 7.5% | |
| 39745 | 6.1% | |
| 35856 | 5.5% | |
| 29106 | 4.5% | |
| 27627 | 4.3% | |
| 26582 | 4.1% | |
| 26362 | 4.1% | |
| Other values (30) | 191648 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 507483 | |
| Space Separator | 116176 | 18.0% |
| Other Punctuation | 11708 | 1.8% |
| Uppercase Letter | 7574 | 1.2% |
| Decimal Number | 4053 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 54032 | 10.6% | |
| 51591 | 10.2% | |
| 48269 | 9.5% | |
| 39745 | 7.8% | |
| 35856 | 7.1% | |
| 29106 | 5.7% | |
| 27627 | 5.4% | |
| 26582 | 5.2% | |
| 26362 | 5.2% | |
| 24962 | 4.9% | |
| Other values (14) | 143351 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 2708 | ||
| 1796 | ||
| 1672 | ||
| 1354 | ||
| 38 | 0.5% | |
| 3 | < 0.1% | |
| 3 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 5854 | ||
| 2788 | ||
| 1673 | 14.3% | |
| 1351 | 11.5% | |
| 42 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 1351 | ||
| 1351 | ||
| 1351 |
Space Separator
| Value | Count | Frequency (%) |
| 116176 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 515057 | |
| Common | 131937 | 20.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 54032 | 10.5% | |
| 51591 | 10.0% | |
| 48269 | 9.4% | |
| 39745 | 7.7% | |
| 35856 | 7.0% | |
| 29106 | 5.7% | |
| 27627 | 5.4% | |
| 26582 | 5.2% | |
| 26362 | 5.1% | |
| 24962 | 4.8% | |
| Other values (21) | 150925 |
Common
| Value | Count | Frequency (%) |
| 116176 | ||
| 5854 | 4.4% | |
| 2788 | 2.1% | |
| 1673 | 1.3% | |
| 1351 | 1.0% | |
| 1351 | 1.0% | |
| 1351 | 1.0% | |
| 1351 | 1.0% | |
| 42 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 646994 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 116176 | ||
| 54032 | 8.4% | |
| 51591 | 8.0% | |
| 48269 | 7.5% | |
| 39745 | 6.1% | |
| 35856 | 5.5% | |
| 29106 | 4.5% | |
| 27627 | 4.3% | |
| 26582 | 4.1% | |
| 26362 | 4.1% | |
| Other values (30) | 191648 |
voice_4.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3121 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25369 |
| Missing (%) | 89.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
voice_4.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3139 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 78600 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3134 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 | 11.0% |
| (Missing) | 25350 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6288 | 8.0% | |
| 4153 | 5.3% | |
| 4134 | 5.3% | |
| 4107 | 5.2% | |
| 4069 | 5.2% | |
| 4064 | 5.2% | |
| 2695 | 3.4% | |
| 2141 | 2.7% | |
| 1065 | 1.4% | |
| 1048 | 1.3% | |
| Other values (48) | 44836 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 40010 | |
| Uppercase Letter | 21446 | |
| Decimal Number | 10856 | 13.8% |
| Connector Punctuation | 6288 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4153 | 10.4% | |
| 4134 | 10.3% | |
| 4107 | 10.3% | |
| 4069 | 10.2% | |
| 4064 | 10.2% | |
| 1017 | 2.5% | |
| 1016 | 2.5% | |
| 1009 | 2.5% | |
| 1003 | 2.5% | |
| 998 | 2.5% | |
| Other values (15) | 14440 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1065 | 5.0% | |
| 1033 | 4.8% | |
| 994 | 4.6% | |
| 993 | 4.6% | |
| 992 | 4.6% | |
| 990 | 4.6% | |
| 984 | 4.6% | |
| 983 | 4.6% | |
| 983 | 4.6% | |
| 980 | 4.6% | |
| Other values (12) | 11449 |
Decimal Number
| Value | Count | Frequency (%) |
| 2695 | ||
| 2141 | ||
| 1048 | 9.7% | |
| 973 | 9.0% | |
| 963 | 8.9% | |
| 956 | 8.8% | |
| 946 | 8.7% | |
| 914 | 8.4% | |
| 216 | 2.0% | |
| 4 | < 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6288 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 61456 | |
| Common | 17144 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4153 | 6.8% | |
| 4134 | 6.7% | |
| 4107 | 6.7% | |
| 4069 | 6.6% | |
| 4064 | 6.6% | |
| 1065 | 1.7% | |
| 1033 | 1.7% | |
| 1017 | 1.7% | |
| 1016 | 1.7% | |
| 1009 | 1.6% | |
| Other values (37) | 35789 |
Common
| Value | Count | Frequency (%) |
| 6288 | ||
| 2695 | ||
| 2141 | 12.5% | |
| 1048 | 6.1% | |
| 973 | 5.7% | |
| 963 | 5.6% | |
| 956 | 5.6% | |
| 946 | 5.5% | |
| 914 | 5.3% | |
| 216 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 78600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6288 | 8.0% | |
| 4153 | 5.3% | |
| 4134 | 5.3% | |
| 4107 | 5.2% | |
| 4069 | 5.2% | |
| 4064 | 5.2% | |
| 2695 | 3.4% | |
| 2141 | 2.7% | |
| 1065 | 1.4% | |
| 1048 | 1.3% | |
| Other values (48) | 44836 |
voice_4.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 348 |
|---|---|
| Median length | 196 |
| Mean length | 174.20356 |
| Min length | 77 |
Characters and Unicode
| Total characters | 547696 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Common Values
| Value | Count | Frequency (%) |
| 1895 | 6.7% | |
| 1199 | 4.2% | |
| 42 | 0.1% | |
| 7 | < 0.1% | |
| 1 | < 0.1% | |
| (Missing) | 25350 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 8297 | 7.8% | |
| 7579 | 7.1% | |
| 5080 | 4.8% | |
| 4301 | 4.0% | |
| 3790 | 3.6% | |
| 3790 | 3.6% | |
| 3094 | 2.9% | |
| 3094 | 2.9% | |
| 3094 | 2.9% | |
| 2398 | 2.3% | |
| Other values (102) | 61998 |
Most occurring characters
| Value | Count | Frequency (%) |
| 103371 | ||
| 53202 | 9.7% | |
| 45246 | 8.3% | |
| 37523 | 6.9% | |
| 33942 | 6.2% | |
| 25837 | 4.7% | |
| 25442 | 4.6% | |
| 23205 | 4.2% | |
| 22788 | 4.2% | |
| 21713 | 4.0% | |
| Other values (27) | 155427 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 420042 | |
| Space Separator | 103371 | 18.9% |
| Other Punctuation | 14021 | 2.6% |
| Uppercase Letter | 10262 | 1.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 53202 | ||
| 45246 | ||
| 37523 | 8.9% | |
| 33942 | 8.1% | |
| 25837 | 6.2% | |
| 25442 | 6.1% | |
| 23205 | 5.5% | |
| 22788 | 5.4% | |
| 21713 | 5.2% | |
| 20196 | 4.8% | |
| Other values (14) | 110948 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 7025 | ||
| 1896 | 18.5% | |
| 1199 | 11.7% | |
| 43 | 0.4% | |
| 42 | 0.4% | |
| 42 | 0.4% | |
| 8 | 0.1% | |
| 7 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 7062 | ||
| 3802 | ||
| 1951 | 13.9% | |
| 1206 | 8.6% |
Space Separator
| Value | Count | Frequency (%) |
| 103371 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 430304 | |
| Common | 117392 | 21.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 53202 | ||
| 45246 | 10.5% | |
| 37523 | 8.7% | |
| 33942 | 7.9% | |
| 25837 | 6.0% | |
| 25442 | 5.9% | |
| 23205 | 5.4% | |
| 22788 | 5.3% | |
| 21713 | 5.0% | |
| 20196 | 4.7% | |
| Other values (22) | 121210 |
Common
| Value | Count | Frequency (%) |
| 103371 | ||
| 7062 | 6.0% | |
| 3802 | 3.2% | |
| 1951 | 1.7% | |
| 1206 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 547696 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 103371 | ||
| 53202 | 9.7% | |
| 45246 | 8.3% | |
| 37523 | 6.9% | |
| 33942 | 6.2% | |
| 25837 | 4.7% | |
| 25442 | 4.6% | |
| 23205 | 4.2% | |
| 22788 | 4.2% | |
| 21713 | 4.0% | |
| Other values (27) | 155427 |
voice_5.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3052 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25439 |
| Missing (%) | 89.3% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3042) | 3042 | 10.7% |
| (Missing) | 25439 |
| Value | Count | Frequency (%) |
| 3055 | 10.7% | |
| (Missing) | 25439 |
| Value | Count | Frequency (%) |
| 3055 | 10.7% | |
| (Missing) | 25439 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3042) | 3042 | 10.7% |
| (Missing) | 25439 |
| Value | Count | Frequency (%) |
| 3055 | 10.7% | |
| (Missing) | 25439 |
| Value | Count | Frequency (%) |
| 3055 | 10.7% | |
| (Missing) | 25439 |
voice_5.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3061 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 76625 |
|---|---|
| Distinct characters | 56 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3057 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 | 10.7% |
| (Missing) | 25429 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2568 | 3.4% | |
| 2135 | 2.8% | |
| 1121 | 1.5% | |
| 1043 | 1.4% | |
| Other values (46) | 43624 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 38966 | |
| Uppercase Letter | 20940 | |
| Decimal Number | 10589 | 13.8% |
| Connector Punctuation | 6130 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4055 | 10.4% | |
| 4028 | 10.3% | |
| 3992 | 10.2% | |
| 3965 | 10.2% | |
| 3964 | 10.2% | |
| 1011 | 2.6% | |
| 992 | 2.5% | |
| 975 | 2.5% | |
| 970 | 2.5% | |
| 968 | 2.5% | |
| Other values (15) | 14046 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1043 | 5.0% | |
| 1003 | 4.8% | |
| 981 | 4.7% | |
| 980 | 4.7% | |
| 976 | 4.7% | |
| 972 | 4.6% | |
| 968 | 4.6% | |
| 951 | 4.5% | |
| 950 | 4.5% | |
| 950 | 4.5% | |
| Other values (12) | 11166 |
Decimal Number
| Value | Count | Frequency (%) |
| 2568 | ||
| 2135 | ||
| 1121 | ||
| 995 | 9.4% | |
| 987 | 9.3% | |
| 933 | 8.8% | |
| 930 | 8.8% | |
| 920 | 8.7% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6130 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59906 | |
| Common | 16719 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4055 | 6.8% | |
| 4028 | 6.7% | |
| 3992 | 6.7% | |
| 3965 | 6.6% | |
| 3964 | 6.6% | |
| 1043 | 1.7% | |
| 1011 | 1.7% | |
| 1003 | 1.7% | |
| 992 | 1.7% | |
| 981 | 1.6% | |
| Other values (37) | 34872 |
Common
| Value | Count | Frequency (%) |
| 6130 | ||
| 2568 | ||
| 2135 | 12.8% | |
| 1121 | 6.7% | |
| 995 | 6.0% | |
| 987 | 5.9% | |
| 933 | 5.6% | |
| 930 | 5.6% | |
| 920 | 5.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 76625 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2568 | 3.4% | |
| 2135 | 2.8% | |
| 1121 | 1.5% | |
| 1043 | 1.4% | |
| Other values (46) | 43624 |
voice_5.prompt
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 2.0 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 348 |
| Mean length | 346.92985 |
| Min length | 135 |
Characters and Unicode
| Total characters | 1063340 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1671 | 5.9% | |
| 1338 | 4.7% | |
| 43 | 0.2% | |
| 13 | < 0.1% | |
| (Missing) | 25429 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 16845 | 8.9% | |
| 10896 | 5.7% | |
| 5685 | 3.0% | |
| 5352 | 2.8% | |
| 5267 | 2.8% | |
| 5013 | 2.6% | |
| 4766 | 2.5% | |
| 4433 | 2.3% | |
| 3385 | 1.8% | |
| 3342 | 1.8% | |
| Other values (107) | 124646 |
Most occurring characters
| Value | Count | Frequency (%) |
| 186565 | ||
| 119968 | ||
| 85506 | 8.0% | |
| 69527 | 6.5% | |
| 55877 | 5.3% | |
| 55076 | 5.2% | |
| 54225 | 5.1% | |
| 51487 | 4.8% | |
| 50627 | 4.8% | |
| 42750 | 4.0% | |
| Other values (27) | 291732 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 834747 | |
| Space Separator | 186565 | 17.5% |
| Other Punctuation | 25694 | 2.4% |
| Uppercase Letter | 16205 | 1.5% |
| Decimal Number | 129 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 119968 | ||
| 85506 | 10.2% | |
| 69527 | 8.3% | |
| 55877 | 6.7% | |
| 55076 | 6.6% | |
| 54225 | 6.5% | |
| 51487 | 6.2% | |
| 50627 | 6.1% | |
| 42750 | 5.1% | |
| 35059 | 4.2% | |
| Other values (13) | 214645 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 11050 | ||
| 1757 | 10.8% | |
| 1714 | 10.6% | |
| 1671 | 10.3% | |
| 13 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 9779 | ||
| 9169 | ||
| 5352 | ||
| 1351 | 5.3% | |
| 43 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 43 | ||
| 43 | ||
| 43 |
Space Separator
| Value | Count | Frequency (%) |
| 186565 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 850952 | |
| Common | 212388 | 20.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 119968 | ||
| 85506 | 10.0% | |
| 69527 | 8.2% | |
| 55877 | 6.6% | |
| 55076 | 6.5% | |
| 54225 | 6.4% | |
| 51487 | 6.1% | |
| 50627 | 5.9% | |
| 42750 | 5.0% | |
| 35059 | 4.1% | |
| Other values (18) | 230850 |
Common
| Value | Count | Frequency (%) |
| 186565 | ||
| 9779 | 4.6% | |
| 9169 | 4.3% | |
| 5352 | 2.5% | |
| 1351 | 0.6% | |
| 43 | < 0.1% | |
| 43 | < 0.1% | |
| 43 | < 0.1% | |
| 43 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1063340 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 186565 | ||
| 119968 | ||
| 85506 | 8.0% | |
| 69527 | 6.5% | |
| 55877 | 5.3% | |
| 55076 | 5.2% | |
| 54225 | 5.1% | |
| 51487 | 4.8% | |
| 50627 | 4.8% | |
| 42750 | 4.0% | |
| Other values (27) | 291732 |
voice_6.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3053 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25438 |
| Missing (%) | 89.3% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3043) | 3043 | 10.7% |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3043) | 3043 | 10.7% |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
| Value | Count | Frequency (%) |
| 3056 | 10.7% | |
| (Missing) | 25438 |
voice_6.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3061 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 76625 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3057 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 | 10.7% |
| (Missing) | 25429 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3051) | 3051 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2663 | 3.5% | |
| 1201 | 1.6% | |
| 1043 | 1.4% | |
| 1011 | 1.3% | |
| Other values (48) | 44573 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 38966 | |
| Uppercase Letter | 20940 | |
| Decimal Number | 10589 | 13.8% |
| Connector Punctuation | 6130 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4055 | 10.4% | |
| 4028 | 10.3% | |
| 3992 | 10.2% | |
| 3965 | 10.2% | |
| 3964 | 10.2% | |
| 1011 | 2.6% | |
| 992 | 2.5% | |
| 975 | 2.5% | |
| 970 | 2.5% | |
| 968 | 2.5% | |
| Other values (15) | 14046 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1043 | 5.0% | |
| 1003 | 4.8% | |
| 981 | 4.7% | |
| 980 | 4.7% | |
| 976 | 4.7% | |
| 972 | 4.6% | |
| 968 | 4.6% | |
| 951 | 4.5% | |
| 950 | 4.5% | |
| 950 | 4.5% | |
| Other values (12) | 11166 |
Decimal Number
| Value | Count | Frequency (%) |
| 2663 | ||
| 1201 | ||
| 995 | 9.4% | |
| 960 | 9.1% | |
| 951 | 9.0% | |
| 933 | 8.8% | |
| 931 | 8.8% | |
| 920 | 8.7% | |
| 897 | 8.5% | |
| 138 | 1.3% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6130 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59906 | |
| Common | 16719 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4055 | 6.8% | |
| 4028 | 6.7% | |
| 3992 | 6.7% | |
| 3965 | 6.6% | |
| 3964 | 6.6% | |
| 1043 | 1.7% | |
| 1011 | 1.7% | |
| 1003 | 1.7% | |
| 992 | 1.7% | |
| 981 | 1.6% | |
| Other values (37) | 34872 |
Common
| Value | Count | Frequency (%) |
| 6130 | ||
| 2663 | ||
| 1201 | 7.2% | |
| 995 | 6.0% | |
| 960 | 5.7% | |
| 951 | 5.7% | |
| 933 | 5.6% | |
| 931 | 5.6% | |
| 920 | 5.5% | |
| 897 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 76625 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6130 | 8.0% | |
| 4055 | 5.3% | |
| 4028 | 5.3% | |
| 3992 | 5.2% | |
| 3965 | 5.2% | |
| 3964 | 5.2% | |
| 2663 | 3.5% | |
| 1201 | 1.6% | |
| 1043 | 1.4% | |
| 1011 | 1.3% | |
| Other values (48) | 44573 |
voice_6.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25429 |
| Missing (%) | 89.2% |
| Memory size | 1.6 MiB |
Length
| Max length | 322 |
|---|---|
| Median length | 322 |
| Mean length | 212.44209 |
| Min length | 57 |
Characters and Unicode
| Total characters | 651135 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
| 1676 | 5.9% | |
| 1335 | 4.7% | |
| 34 | 0.1% | |
| 14 | < 0.1% | |
| 4 | < 0.1% | |
| 2 | < 0.1% | |
| (Missing) | 25429 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 10863 | 9.1% | |
| 8506 | 7.1% | |
| 6730 | 5.6% | |
| 5030 | 4.2% | |
| 5028 | 4.2% | |
| 4751 | 4.0% | |
| 3374 | 2.8% | |
| 3366 | 2.8% | |
| 3352 | 2.8% | |
| 3011 | 2.5% | |
| Other values (109) | 65205 |
Most occurring characters
| Value | Count | Frequency (%) |
| 116151 | ||
| 69483 | 10.7% | |
| 59520 | 9.1% | |
| 44576 | 6.8% | |
| 40571 | 6.2% | |
| 30448 | 4.7% | |
| 29260 | 4.5% | |
| 27119 | 4.2% | |
| 26462 | 4.1% | |
| 22809 | 3.5% | |
| Other values (31) | 184736 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 504195 | |
| Space Separator | 116151 | 17.8% |
| Uppercase Letter | 15540 | 2.4% |
| Other Punctuation | 15237 | 2.3% |
| Decimal Number | 12 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 69483 | ||
| 59520 | ||
| 44576 | 8.8% | |
| 40571 | 8.0% | |
| 30448 | 6.0% | |
| 29260 | 5.8% | |
| 27119 | 5.4% | |
| 26462 | 5.2% | |
| 22809 | 4.5% | |
| 21673 | 4.3% | |
| Other values (14) | 132274 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 7776 | ||
| 1686 | 10.8% | |
| 1676 | 10.8% | |
| 1676 | 10.8% | |
| 1339 | 8.6% | |
| 1335 | 8.6% | |
| 38 | 0.2% | |
| 14 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 8131 | ||
| 4360 | ||
| 1371 | 9.0% | |
| 1371 | 9.0% | |
| 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | ||
| 4 | ||
| 4 |
Space Separator
| Value | Count | Frequency (%) |
| 116151 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 519735 | |
| Common | 131400 | 20.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 69483 | ||
| 59520 | ||
| 44576 | 8.6% | |
| 40571 | 7.8% | |
| 30448 | 5.9% | |
| 29260 | 5.6% | |
| 27119 | 5.2% | |
| 26462 | 5.1% | |
| 22809 | 4.4% | |
| 21673 | 4.2% | |
| Other values (22) | 147814 |
Common
| Value | Count | Frequency (%) |
| 116151 | ||
| 8131 | 6.2% | |
| 4360 | 3.3% | |
| 1371 | 1.0% | |
| 1371 | 1.0% | |
| 4 | < 0.1% | |
| 4 | < 0.1% | |
| 4 | < 0.1% | |
| 4 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 651135 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 116151 | ||
| 69483 | 10.7% | |
| 59520 | 9.1% | |
| 44576 | 6.8% | |
| 40571 | 6.2% | |
| 30448 | 4.7% | |
| 29260 | 4.5% | |
| 27119 | 4.2% | |
| 26462 | 4.1% | |
| 22809 | 3.5% | |
| Other values (31) | 184736 |
voice_7.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3253 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25237 |
| Missing (%) | 88.6% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3243) | 3243 | 11.4% |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3243) | 3243 | 11.4% |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
| Value | Count | Frequency (%) |
| 3257 | 11.4% | |
| (Missing) | 25237 |
voice_7.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3277 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 82050 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3272 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3267) | 3267 | 11.5% |
| (Missing) | 25212 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3267) | 3267 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6564 | 8.0% | |
| 4347 | 5.3% | |
| 4313 | 5.3% | |
| 4283 | 5.2% | |
| 4246 | 5.2% | |
| 4244 | 5.2% | |
| 2668 | 3.3% | |
| 2253 | 2.7% | |
| 1266 | 1.5% | |
| 1110 | 1.4% | |
| Other values (48) | 46756 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 41758 | |
| Uppercase Letter | 22399 | |
| Decimal Number | 11329 | 13.8% |
| Connector Punctuation | 6564 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 4347 | 10.4% | |
| 4313 | 10.3% | |
| 4283 | 10.3% | |
| 4246 | 10.2% | |
| 4244 | 10.2% | |
| 1081 | 2.6% | |
| 1058 | 2.5% | |
| 1054 | 2.5% | |
| 1054 | 2.5% | |
| 1038 | 2.5% | |
| Other values (15) | 15040 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 1110 | 5.0% | |
| 1079 | 4.8% | |
| 1040 | 4.6% | |
| 1038 | 4.6% | |
| 1032 | 4.6% | |
| 1032 | 4.6% | |
| 1029 | 4.6% | |
| 1027 | 4.6% | |
| 1025 | 4.6% | |
| 1023 | 4.6% | |
| Other values (12) | 11964 |
Decimal Number
| Value | Count | Frequency (%) |
| 2668 | ||
| 2253 | ||
| 1266 | ||
| 1019 | 9.0% | |
| 1014 | 9.0% | |
| 1000 | 8.8% | |
| 987 | 8.7% | |
| 981 | 8.7% | |
| 140 | 1.2% | |
| 1 | < 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| 6564 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 64157 | |
| Common | 17893 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 4347 | 6.8% | |
| 4313 | 6.7% | |
| 4283 | 6.7% | |
| 4246 | 6.6% | |
| 4244 | 6.6% | |
| 1110 | 1.7% | |
| 1081 | 1.7% | |
| 1079 | 1.7% | |
| 1058 | 1.6% | |
| 1054 | 1.6% | |
| Other values (37) | 37342 |
Common
| Value | Count | Frequency (%) |
| 6564 | ||
| 2668 | ||
| 2253 | 12.6% | |
| 1266 | 7.1% | |
| 1019 | 5.7% | |
| 1014 | 5.7% | |
| 1000 | 5.6% | |
| 987 | 5.5% | |
| 981 | 5.5% | |
| 140 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 82050 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 6564 | 8.0% | |
| 4347 | 5.3% | |
| 4313 | 5.3% | |
| 4283 | 5.2% | |
| 4246 | 5.2% | |
| 4244 | 5.2% | |
| 2668 | 3.3% | |
| 2253 | 2.7% | |
| 1266 | 1.5% | |
| 1110 | 1.4% | |
| Other values (48) | 46756 |
voice_7.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.8 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 349 |
| Mean length | 286.156 |
| Min length | 77 |
Characters and Unicode
| Total characters | 939164 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Common Values
| Value | Count | Frequency (%) |
| 1879 | 6.6% | |
| 1336 | 4.7% | |
| 53 | 0.2% | |
| 7 | < 0.1% | |
| 3 | < 0.1% | |
| 3 | < 0.1% | |
| 1 | < 0.1% | |
| (Missing) | 25212 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 11074 | 6.2% | |
| 9896 | 5.5% | |
| 7520 | 4.2% | |
| 6438 | 3.6% | |
| 5692 | 3.2% | |
| 5161 | 2.9% | |
| 4565 | 2.5% | |
| 3758 | 2.1% | |
| 3758 | 2.1% | |
| 3758 | 2.1% | |
| Other values (151) | 118193 |
Most occurring characters
| Value | Count | Frequency (%) |
| 176531 | ||
| 81827 | 8.7% | |
| 73696 | 7.8% | |
| 71175 | 7.6% | |
| 51996 | 5.5% | |
| 46379 | 4.9% | |
| 44155 | 4.7% | |
| 43764 | 4.7% | |
| 42029 | 4.5% | |
| 40534 | 4.3% | |
| Other values (31) | 267078 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 718439 | |
| Space Separator | 176531 | 18.8% |
| Other Punctuation | 25419 | 2.7% |
| Uppercase Letter | 18754 | 2.0% |
| Decimal Number | 21 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| 81827 | ||
| 73696 | 10.3% | |
| 71175 | 9.9% | |
| 51996 | 7.2% | |
| 46379 | 6.5% | |
| 44155 | 6.1% | |
| 43764 | 6.1% | |
| 42029 | 5.9% | |
| 40534 | 5.6% | |
| 27629 | 3.8% | |
| Other values (14) | 195255 |
Uppercase Letter
| Value | Count | Frequency (%) |
| 17226 | ||
| 1389 | 7.4% | |
| 68 | 0.4% | |
| 63 | 0.3% | |
| 3 | < 0.1% | |
| 3 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| 9835 | ||
| 8859 | ||
| 4833 | ||
| 1885 | 7.4% | |
| 7 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 7 | ||
| 7 | ||
| 7 |
Space Separator
| Value | Count | Frequency (%) |
| 176531 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 737193 | |
| Common | 201971 | 21.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| 81827 | 11.1% | |
| 73696 | 10.0% | |
| 71175 | 9.7% | |
| 51996 | 7.1% | |
| 46379 | 6.3% | |
| 44155 | 6.0% | |
| 43764 | 5.9% | |
| 42029 | 5.7% | |
| 40534 | 5.5% | |
| 27629 | 3.7% | |
| Other values (22) | 214009 |
Common
| Value | Count | Frequency (%) |
| 176531 | ||
| 9835 | 4.9% | |
| 8859 | 4.4% | |
| 4833 | 2.4% | |
| 1885 | 0.9% | |
| 7 | < 0.1% | |
| 7 | < 0.1% | |
| 7 | < 0.1% | |
| 7 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 939164 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 176531 | ||
| 81827 | 8.7% | |
| 73696 | 7.8% | |
| 71175 | 7.6% | |
| 51996 | 5.5% | |
| 46379 | 4.9% | |
| 44155 | 4.7% | |
| 43764 | 4.7% | |
| 42029 | 4.5% | |
| 40534 | 4.3% | |
| Other values (31) | 267078 |
voice_8.GCSData
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 28494 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 222.7 KiB |
| Distinct | 3121 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25369 |
| Missing (%) | 89.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3111) | 3111 | 10.9% |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
| Value | Count | Frequency (%) |
| 3125 | 11.0% | |
| (Missing) | 25369 |
voice_8.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3139 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 25 |
| Min length | 25 |
Characters and Unicode
| Total characters | 78600 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3134 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 2 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 | 11.0% |
| (Missing) | 25350 |
Length
| Value | Count | Frequency (%) |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 2 | 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| 1 | < 0.1% | |
| Other values (3129) | 3129 |
Most occurring characters
| Value | Count | Frequency (%) |
| 6288 | 8.0% | |
| 4153 | 5.3% | |
| 4134 | 5.3% | |
| 4107 | 5.2% | |
| 4069 | 5.2% | |
| 4064 | 5.2% | |
| 2110 | 2.7% | |
| 1687 | 2.1% | |
| 1065 | 1.4% | |
| 1033 | 1.3% | |
| Other values (48) | 45890 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 40010 | |
| Uppercase Letter | 21446 | |
| Decimal Number | 10856 | 13.8% |
| Connector Punctuation | 6288 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 4153 | 10.4% |
| | 4134 | 10.3% |
| | 4107 | 10.3% |
| | 4069 | 10.2% |
| | 4064 | 10.2% |
| | 1017 | 2.5% |
| | 1016 | 2.5% |
| | 1009 | 2.5% |
| | 1003 | 2.5% |
| | 998 | 2.5% |
| Other values (15) | 14440 | |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 1065 | 5.0% |
| | 1033 | 4.8% |
| | 994 | 4.6% |
| | 993 | 4.6% |
| | 992 | 4.6% |
| | 990 | 4.6% |
| | 984 | 4.6% |
| | 983 | 4.6% |
| | 983 | 4.6% |
| | 980 | 4.6% |
| Other values (12) | 11449 | |
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| | 2110 | |
| | 1687 | |
| | 1016 | |
| | 1006 | |
| | 999 | |
| | 973 | |
| | 956 | |
| | 946 | |
| | 944 | |
| | 219 | 2.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| | 6288 | |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 61456 | |
| Common | 17144 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| | 4153 | 6.8% |
| | 4134 | 6.7% |
| | 4107 | 6.7% |
| | 4069 | 6.6% |
| | 4064 | 6.6% |
| | 1065 | 1.7% |
| | 1033 | 1.7% |
| | 1017 | 1.7% |
| | 1016 | 1.7% |
| | 1009 | 1.6% |
| Other values (37) | 35789 | |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| | 6288 | |
| | 2110 | 12.3% |
| | 1687 | 9.8% |
| | 1016 | 5.9% |
| | 1006 | 5.9% |
| | 999 | 5.8% |
| | 973 | 5.7% |
| | 956 | 5.6% |
| | 946 | 5.5% |
| | 944 | 5.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 78600 | |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| | 6288 | 8.0% |
| | 4153 | 5.3% |
| | 4134 | 5.3% |
| | 4107 | 5.2% |
| | 4069 | 5.2% |
| | 4064 | 5.2% |
| | 2110 | 2.7% |
| | 1687 | 2.1% |
| | 1065 | 1.4% |
| | 1033 | 1.3% |
| Other values (48) | 45890 | |
voice_8.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25350 |
| Missing (%) | 89.0% |
| Memory size | 1.5 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 77 |
| Mean length | 183.46024 |
| Min length | 57 |
Characters and Unicode
| Total characters | 576799 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 1902 | 6.7% |
| | 1197 | 4.2% |
| | 36 | 0.1% |
| | 4 | < 0.1% |
| | 3 | < 0.1% |
| | 2 | < 0.1% |
| (Missing) | 25350 | |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| | 9317 | 9.3% |
| | 7366 | 7.4% |
| | 4912 | 4.9% |
| | 3591 | 3.6% |
| | 3143 | 3.1% |
| | 2434 | 2.4% |
| | 2394 | 2.4% |
| | 2394 | 2.4% |
| | 2394 | 2.4% |
| | 2394 | 2.4% |
| Other values (111) | 59527 | |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| | 96722 | |
| | 65038 | 11.3% |
| | 43371 | 7.5% |
| | 35667 | 6.2% |
| | 32910 | 5.7% |
| | 31255 | 5.4% |
| | 30638 | 5.3% |
| | 24704 | 4.3% |
| | 23648 | 4.1% |
| | 23027 | 4.0% |
| Other values (24) | 169819 | |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 447950 | |
| Space Separator | 96722 | 16.8% |
| Other Punctuation | 19447 | 3.4% |
| Uppercase Letter | 12680 | 2.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 65038 | |
| | 43371 | 9.7% |
| | 35667 | 8.0% |
| | 32910 | 7.3% |
| | 31255 | 7.0% |
| | 30638 | 6.8% |
| | 24704 | 5.5% |
| | 23648 | 5.3% |
| | 23027 | 5.1% |
| | 22962 | 5.1% |
| Other values (13) | 114730 | |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 5270 | |
| | 3099 | |
| | 1902 | 15.0% |
| | 1201 | 9.5% |
| | 1200 | 9.5% |
| | 8 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| | 7931 | |
| | 5621 | |
| | 3952 | |
| | 1943 | 10.0% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| | 96722 | |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 460630 | |
| Common | 116169 | 20.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| | 65038 | |
| | 43371 | 9.4% |
| | 35667 | 7.7% |
| | 32910 | 7.1% |
| | 31255 | 6.8% |
| | 30638 | 6.7% |
| | 24704 | 5.4% |
| | 23648 | 5.1% |
| | 23027 | 5.0% |
| | 22962 | 5.0% |
| Other values (19) | 127410 | |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| | 96722 | |
| | 7931 | 6.8% |
| | 5621 | 4.8% |
| | 3952 | 3.4% |
| | 1943 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 576799 | |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| | 96722 | |
| | 65038 | 11.3% |
| | 43371 | 7.5% |
| | 35667 | 6.2% |
| | 32910 | 5.7% |
| | 31255 | 5.4% |
| | 30638 | 5.3% |
| | 24704 | 4.3% |
| | 23648 | 4.1% |
| | 23027 | 4.0% |
| Other values (24) | 169819 | |
| Distinct | 3254 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 25236 |
| Missing (%) | 88.6% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) |
|---|---|---|
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| Other values (3244) | 3244 | 11.4% |
| (Missing) | 25236 | |
| Value | Count | Frequency (%) |
|---|---|---|
| | 3258 | 11.4% |
| (Missing) | 25236 | |
| Value | Count | Frequency (%) |
|---|---|---|
| | 3258 | 11.4% |
| (Missing) | 25236 | |
| Value | Count | Frequency (%) |
|---|---|---|
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| Other values (3244) | 3244 | 11.4% |
| (Missing) | 25236 | |
| Value | Count | Frequency (%) |
|---|---|---|
| | 3258 | 11.4% |
| (Missing) | 25236 | |
| Value | Count | Frequency (%) |
|---|---|---|
| | 3258 | 11.4% |
| (Missing) | 25236 | |
voice_9.fileName
Categorical
HIGH CARDINALITY  MISSING  UNIFORM 
| Distinct | 3277 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.0 MiB |
Length
| Max length | 26 |
|---|---|
| Median length | 25 |
| Mean length | 25.001828 |
| Min length | 25 |
Characters and Unicode
| Total characters | 82056 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3272 |
|---|---|
| Unique (%) | 99.7% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 2 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| Other values (3267) | 3267 | 11.5% |
| (Missing) | 25212 | |
Length
| Value | Count | Frequency (%) |
|---|---|---|
| | 2 | 0.1% |
| | 2 | 0.1% |
| | 2 | 0.1% |
| | 2 | 0.1% |
| | 2 | 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| | 1 | < 0.1% |
| Other values (3267) | 3267 | |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| | 6564 | 8.0% |
| | 4347 | 5.3% |
| | 4313 | 5.3% |
| | 4283 | 5.2% |
| | 4246 | 5.2% |
| | 4244 | 5.2% |
| | 3894 | 4.7% |
| | 1324 | 1.6% |
| | 1110 | 1.4% |
| | 1081 | 1.3% |
| Other values (48) | 46650 | |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 41758 | |
| Uppercase Letter | 22399 | |
| Decimal Number | 11335 | 13.8% |
| Connector Punctuation | 6564 | 8.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 4347 | 10.4% |
| | 4313 | 10.3% |
| | 4283 | 10.3% |
| | 4246 | 10.2% |
| | 4244 | 10.2% |
| | 1081 | 2.6% |
| | 1058 | 2.5% |
| | 1054 | 2.5% |
| | 1054 | 2.5% |
| | 1038 | 2.5% |
| Other values (15) | 15040 | |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 1110 | 5.0% |
| | 1079 | 4.8% |
| | 1040 | 4.6% |
| | 1038 | 4.6% |
| | 1032 | 4.6% |
| | 1032 | 4.6% |
| | 1029 | 4.6% |
| | 1027 | 4.6% |
| | 1025 | 4.6% |
| | 1023 | 4.6% |
| Other values (12) | 11964 | |
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| | 3894 | |
| | 1324 | 11.7% |
| | 1062 | 9.4% |
| | 1051 | 9.3% |
| | 1014 | 8.9% |
| | 1008 | 8.9% |
| | 1000 | 8.8% |
| | 970 | 8.6% |
| | 6 | 0.1% |
| | 6 | 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| | 6564 | |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 64157 | |
| Common | 17899 | 21.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| | 4347 | 6.8% |
| | 4313 | 6.7% |
| | 4283 | 6.7% |
| | 4246 | 6.6% |
| | 4244 | 6.6% |
| | 1110 | 1.7% |
| | 1081 | 1.7% |
| | 1079 | 1.7% |
| | 1058 | 1.6% |
| | 1054 | 1.6% |
| Other values (37) | 37342 | |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| | 6564 | |
| | 3894 | |
| | 1324 | 7.4% |
| | 1062 | 5.9% |
| | 1051 | 5.9% |
| | 1014 | 5.7% |
| | 1008 | 5.6% |
| | 1000 | 5.6% |
| | 970 | 5.4% |
| | 6 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 82056 | |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| | 6564 | 8.0% |
| | 4347 | 5.3% |
| | 4313 | 5.3% |
| | 4283 | 5.2% |
| | 4246 | 5.2% |
| | 4244 | 5.2% |
| | 3894 | 4.7% |
| | 1324 | 1.6% |
| | 1110 | 1.4% |
| | 1081 | 1.3% |
| Other values (48) | 46650 | |
voice_9.prompt
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 25212 |
| Missing (%) | 88.5% |
| Memory size | 1.1 MiB |
Length
| Max length | 349 |
|---|---|
| Median length | 58 |
| Mean length | 59.881779 |
| Min length | 34 |
Characters and Unicode
| Total characters | 196532 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| | 3252 | 11.4% |
| | 16 | 0.1% |
| | 7 | < 0.1% |
| | 3 | < 0.1% |
| | 3 | < 0.1% |
| | 1 | < 0.1% |
| (Missing) | 25212 | |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| | 3303 | |
| | 3275 | |
| | 3268 | |
| | 3260 | |
| | 3259 | |
| | 3258 | |
| | 3252 | |
| | 3252 | |
| | 3252 | |
| | 3252 | |
| Other values (114) | 1357 | |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| | 30706 | |
| | 26340 | |
| | 20224 | |
| | 16682 | |
| | 13613 | 6.9% |
| | 13223 | 6.7% |
| | 10133 | 5.2% |
| | 10012 | 5.1% |
| | 10010 | 5.1% |
| | 7051 | 3.6% |
| Other values (28) | 38538 | |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 158976 | |
| Space Separator | 30706 | 15.6% |
| Other Punctuation | 3447 | 1.8% |
| Uppercase Letter | 3394 | 1.7% |
| Decimal Number | 9 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 26340 | |
| | 20224 | |
| | 16682 | |
| | 13613 | |
| | 13223 | |
| | 10133 | 6.4% |
| | 10012 | 6.3% |
| | 10010 | 6.3% |
| | 7051 | 4.4% |
| | 6901 | 4.3% |
| Other values (13) | 24787 | |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| | 3259 | |
| | 116 | 3.4% |
| | 9 | 0.3% |
| | 6 | 0.2% |
| | 3 | 0.1% |
| | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| | 3276 | |
| | 67 | 1.9% |
| | 64 | 1.9% |
| | 37 | 1.1% |
| | 3 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| | 3 | |
| | 3 | |
| | 3 | |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| | 30706 | |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 162370 | |
| Common | 34162 | 17.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| | 26340 | |
| | 20224 | |
| | 16682 | |
| | 13613 | |
| | 13223 | |
| | 10133 | 6.2% |
| | 10012 | 6.2% |
| | 10010 | 6.2% |
| | 7051 | 4.3% |
| | 6901 | 4.3% |
| Other values (19) | 28181 | |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| | 30706 | |
| | 3276 | 9.6% |
| | 67 | 0.2% |
| | 64 | 0.2% |
| | 37 | 0.1% |
| | 3 | < 0.1% |
| | 3 | < 0.1% |
| | 3 | < 0.1% |
| | 3 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 196532 | |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| | 30706 | |
| | 26340 | |
| | 20224 | |
| | 16682 | |
| | 13613 | 6.9% |
| | 13223 | 6.7% |
| | 10133 | 5.2% |
| | 10012 | 5.1% |
| | 10010 | 5.1% |
| | 7051 | 3.6% |
| Other values (28) | 38538 | |